diff --git a/.gitignore b/.gitignore index de2c13f..988caf3 100644 --- a/.gitignore +++ b/.gitignore @@ -105,3 +105,6 @@ debug/ store-assets/ .playwright-mcp/ + +src/reverse_api/cursor_bridge/node_modules/ + diff --git a/CHANGELOG.md b/CHANGELOG.md index 9314915..654068e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,7 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] -## [0.9.0] - 2026-05-14 +### Added + +- **`agent_provider: "agent-browser"`**: Shell-driven **[Vercel agent-browser CLI](https://github.com/vercel-labs/agent-browser)**—RAE prefers an `agent-browser` binary on `PATH`, otherwise runs **`npm install -g `** (with a console notice), validates **`--help`**, and only then falls back to **`npx -y `** if npm cannot install. Prompts embed the resolved shell prefix plus **`skills get …` / `skills list`**, HAR flows, and optional `agent_browser_notes`. No bundled browser MCP shim; pin via `agent_browser_npx_package` / `RAE_AGENT_BROWSER_PACKAGE`. ### Added - **Cursor SDK support**: Added `sdk=cursor` / `--sdk cursor` engineering support through a bundled Node bridge around the Cursor TypeScript SDK. Cursor runs use the configured Cursor model (default `composer-2`), accept MCP server configuration, resume Cursor agents across follow-up turns, and normalize streamed tool output plus token usage into the existing TUI/message-store flow diff --git a/README.md b/README.md index dd6a0c9..14cda7c 100644 --- a/README.md +++ b/README.md @@ -55,13 +55,22 @@ Cycle modes with **Shift+Tab**: | Mode | What it does | |------|--------------| | `manual` | You drive the browser; AI generates the client from captured traffic. | -| `agent` | An AI agent drives the browser autonomously (Playwright MCP or Chrome DevTools MCP). | +| `agent` | An AI agent drives capture autonomously (Playwright or Chrome MCP, or Vercel agent-browser CLI). | | `engineer` | Re-run generation on a previous capture (`engineer `). 
| | `collector` | Agent collects structured data (JSON/CSV) using web search + fetch. | Agent mode providers: - **auto** (default): Playwright MCP, single workflow for browsing + reverse engineering. - **chrome-mcp**: drives your real Chrome so you keep existing sessions/cookies. Requires Chrome 146+ and Node.js 20.19+. +- **agent-browser**: [Vercel agent-browser](https://github.com/vercel-labs/agent-browser) **CLI** (not a Reverse API Engineer browser MCP server). At session start RAE uses whatever `agent-browser` is already on `PATH`, otherwise runs **`npm install -g `** (same pin as config / `RAE_AGENT_BROWSER_PACKAGE`), prints a yellow notice, validates with **`--help`**, and only then falls back to **`npx -y `** if npm cannot install. Prompts embed the resolved shell prefix alongside **`skills get core --full`**, **`skills list`**, HAR phases, cloud notes from `agent_browser_notes`. Tune with `agent_browser_npx_package` (optional), env `RAE_AGENT_BROWSER_*`. First Chromium fetch: **`agent-browser install`** (add `--with-deps` on trimmed Linux). + + +Optional sanity checks: + +```bash +agent-browser doctor --offline --quick || true +agent-browser skills list >/dev/null +``` ## Configuration @@ -70,6 +79,8 @@ Settings live in `~/.reverse-api/config.json` and can be edited via `/settings` ```json { "agent_provider": "auto", + "agent_browser_npx_package": "agent-browser@0", + "agent_browser_notes": "", "claude_code_model": "claude-sonnet-4-6", "collector_model": "claude-sonnet-4-6", "opencode_model": "claude-sonnet-4-6", @@ -119,7 +130,7 @@ Pass `--no-interactive` (and/or `--json`) to skip prompts. With `--json`, stdout | `run_id` | `string` \| `null` | Use with `show` / `engineer` / `run`. | | `prompt` | `string` | | | `url` | `string` \| `null` | | -| `mode` | `string` \| `null` | `"auto"` or `"chrome-mcp"`. | +| `mode` | `string` \| `null` | `"auto"`, `"chrome-mcp"`, or `"agent-browser"`. | | `har_path` | `string` \| `null` | Captured HAR. 
| | `script_path` | `string` \| `null` | Generated client. | | `usage` | `object` | `{input_tokens, output_tokens, total_cost}`. | diff --git a/src/reverse_api/agent_browser.py b/src/reverse_api/agent_browser.py new file mode 100644 index 0000000..0b78cb0 --- /dev/null +++ b/src/reverse_api/agent_browser.py @@ -0,0 +1,256 @@ +"""Helpers for agent-browser provider: bootstrap + prompt context. + +Prefer a globally installed ``agent-browser`` on ``PATH``. If missing, run +``npm install -g `` (from config / ``RAE_AGENT_BROWSER_PACKAGE``) so the daemon +pairs consistently with Chromium; notify the operator when Reverse API Engineer performs +that install. Fall back to ``npx -y `` only when npm/global install cannot run. +Models invoke the CLI from shell tooling (for example Bash on Claude SDK runs). +""" + +from __future__ import annotations + +import os +import shlex +import shutil +import subprocess +from dataclasses import dataclass +from typing import Any + +from .utils import get_config_path + +_AGENT_BROWSER_TOOLS = frozenset( + { + "Read", + "Write", + "Edit", + "Glob", + "Grep", + "Bash", + "WebFetch", + "WebSearch", + "AskUserQuestion", + }, +) + +_SHELL_INVOKER: str | None = None + + +@dataclass(frozen=True) +class AgentBrowserSetup: + """Outcome of resolving the CLI before an agent-browser session.""" + + error: str | None = None + notices: tuple[str, ...] 
= () + + @property + def ok(self) -> bool: + return self.error is None + + +def reset_agent_browser_setup_cache() -> None: + """Clear cached shell invocation (for tests only).""" + + global _SHELL_INVOKER + _SHELL_INVOKER = None + + +def print_agent_browser_setup_notices(console, setup: AgentBrowserSetup) -> None: + """Emit install/fallback chatter before streaming.""" + + for note in setup.notices: + console.print(f"\n[yellow]agent-browser[/yellow]: {note}") + + +def _config_manager_snapshot() -> dict[str, Any]: + """Load config defaults merged with ~/.reverse-api/config.json.""" + + try: + from .config import ConfigManager + + cm = ConfigManager(get_config_path()) + return cm.config.copy() + except Exception: + return {} + + +def agent_browser_npx_package() -> str: + """Pinned npm specifier passed to ``npm install -g`` / ``npx -y`` (``RAE_AGENT_BROWSER_PACKAGE``).""" + + env = os.environ.get("RAE_AGENT_BROWSER_PACKAGE", "").strip() + if env: + return env + cfg = _config_manager_snapshot() + return str(cfg.get("agent_browser_npx_package") or "agent-browser@0") + + +def agent_browser_extra_notes() -> str: + """Optional user guidance from ``agent_browser_notes``.""" + + txt = (_config_manager_snapshot().get("agent_browser_notes") or "").strip() + if txt: + return txt + return os.environ.get("RAE_AGENT_BROWSER_NOTES", "").strip() + + +def _probe_help_argv(argv_without_help: list[str]) -> str | None: + try: + proc = subprocess.run( + [*argv_without_help, "--help"], + capture_output=True, + text=True, + timeout=240, + check=False, + ) + except subprocess.TimeoutExpired: + return f"timed out running `{shlex.join(argv_without_help)} --help`." 
+ except OSError as e: + return f"failed subprocess `{shlex.join(argv_without_help)}`: {e}" + + stdout = (proc.stdout or "").strip() + stderr = (proc.stderr or "").strip() + if proc.returncode == 0: + return None + blob = stderr or stdout or "(no output)" + blob = blob[:1600] + return ( + f"`{shlex.join(argv_without_help)} --help` failed with exit " + f"{proc.returncode}. Output (truncated): {blob}" + ) + + +def agent_browser_shell_invoker() -> str: + """Shell command prefix embedded in prompts (requires prior successful ``ensure_*`` unless testing).""" + + global _SHELL_INVOKER + if _SHELL_INVOKER is not None: + return _SHELL_INVOKER + pkg = agent_browser_npx_package() + return shlex.join(["npx", "-y", pkg]) + + +def _finalize_npx_invoker(notices: list[str]) -> AgentBrowserSetup: + global _SHELL_INVOKER + + if shutil.which("npx") is None: + return AgentBrowserSetup( + error=( + "npx not found on PATH. Install agent-browser globally " + "`npm install -g agent-browser`, or fix PATH so npm's global bin is visible." + ), + notices=tuple(notices), + ) + + pkg = agent_browser_npx_package() + err = _probe_help_argv(["npx", "-y", pkg]) + if err: + hint = "`RAE_AGENT_BROWSER_PACKAGE` or config `agent_browser_npx_package` can pin/coerce versions." + return AgentBrowserSetup( + error=f"{err} Adjust {hint}", + notices=tuple(notices), + ) + + inv = shlex.join(["npx", "-y", pkg]) + _SHELL_INVOKER = inv + return AgentBrowserSetup(notices=tuple(notices)) + + +def ensure_agent_browser_runtime() -> AgentBrowserSetup: + """Resolve ``agent-browser`` on PATH or install globally, else fall back to ``npx -y``. + + Successful runs stash the shell snippet for ``agent_browser_prompt_fields``. 
+ """ + + global _SHELL_INVOKER + notices: list[str] = [] + _SHELL_INVOKER = None + + if shutil.which("agent-browser"): + err = _probe_help_argv(["agent-browser"]) + if err: + return AgentBrowserSetup(error=err, notices=tuple(notices)) + _SHELL_INVOKER = "agent-browser" + return AgentBrowserSetup(notices=tuple(notices)) + + if shutil.which("node") is None: + return AgentBrowserSetup(error="node not found in PATH (needed to install/run agent-browser).") + + npm = shutil.which("npm") + pkg = agent_browser_npx_package() + if npm is None: + notices.append("npm was not found on PATH; falling back to `npx -y …` instead of global install.") + return _finalize_npx_invoker(notices) + + try: + proc = subprocess.run( + [npm, "install", "-g", pkg], + capture_output=True, + text=True, + timeout=600, + check=False, + ) + except subprocess.TimeoutExpired: + notices.append("`npm install -g …` timed out; falling back to `npx -y …` for this session.") + return _finalize_npx_invoker(notices) + except OSError as e: + notices.append(f"Could not spawn npm globally ({e}); falling back to `npx -y …`.") + return _finalize_npx_invoker(notices) + + if proc.returncode != 0: + blob = (proc.stderr or proc.stdout or "").strip() + notices.append( + "`npm install -g …` returned a non-zero status (showing truncated output " + f"below); falling back to `npx -y …`. Output: {(blob[:800] + '…') if len(blob) > 800 else blob or '(empty)'}" + ) + return _finalize_npx_invoker(notices) + + notices.append( + f'Installed upstream agent-browser with `npm install -g {pkg}`. ' + 'If Chromium is missing, run `agent-browser install` once (add `--with-deps` on trimmed Linux VMs). ' + 'Reuse this global install across runs for consistent browser pairing.' + ) + + if not shutil.which("agent-browser"): + return AgentBrowserSetup( + error=( + "Installed via npm but `agent-browser` is still not on PATH. " + "Append `npm bin -g` to PATH or reinstall with a toolchain that exposes the global shim." 
+ ), + notices=tuple(notices), + ) + + err = _probe_help_argv(["agent-browser"]) + if err: + return AgentBrowserSetup(error=err, notices=tuple(notices)) + + _SHELL_INVOKER = "agent-browser" + return AgentBrowserSetup(notices=tuple(notices)) + + +def allowed_tools_agent_browser_agent_mode() -> list[str]: + """Tool allow-list for Claude SDK runs when browsing is delegated to agent-browser.""" + + return sorted(_AGENT_BROWSER_TOOLS) + + +def agent_browser_prompt_fields(*, run_id: str, headless: bool) -> dict[str, str]: + """Variables for ``prompts/auto/user_agent_browser.md``.""" + + shell = agent_browser_shell_invoker() + session = f"rae-{run_id}" + headed = ( + "" + if headless + else "Use the global `--headed` flag on subcommands that show a window when you need " + "a visible browser (local debugging only).\n\n" + ) + notes = agent_browser_extra_notes() + notes_block = "" + if notes: + notes_block = f"\n## Extra operator notes (from config or RAE_AGENT_BROWSER_NOTES)\n\n{notes}\n" + return { + "agent_browser_shell": shell, + "agent_browser_npx_package": agent_browser_npx_package(), + "agent_browser_session": session, + "agent_browser_headed_hint": headed, + "agent_browser_notes_block": notes_block, + } diff --git a/src/reverse_api/auto_engineer.py b/src/reverse_api/auto_engineer.py index 05191f8..61ffb60 100644 --- a/src/reverse_api/auto_engineer.py +++ b/src/reverse_api/auto_engineer.py @@ -1,6 +1,7 @@ -"""Auto mode engineers: LLM-controlled browser automation with real-time reverse engineering. +"""Auto mode engineers: LLM-controlled browsing with real-time reverse engineering. -Combines browser automation via MCP with simultaneous API reverse engineering. +Providers **auto** and **chrome-mcp** attach a browser MCP server to the SDK. Provider +**agent-browser** shells the upstream Vercel ``agent-browser`` CLI (auto-install via npm when missing, validated with ``--help``) instead of attaching browser MCP here. 
""" import asyncio @@ -15,6 +16,12 @@ ToolPermissionContext, ) +from .agent_browser import ( + agent_browser_prompt_fields, + allowed_tools_agent_browser_agent_mode, + ensure_agent_browser_runtime, + print_agent_browser_setup_notices, +) from .engineer import ClaudeEngineer from .opencode_engineer import OpenCodeEngineer, debug_log, format_error from .utils import get_har_dir @@ -25,7 +32,7 @@ class ClaudeAutoEngineer(ClaudeEngineer): - """Auto mode using Claude SDK: LLM controls browser via MCP while reverse engineering.""" + """Auto mode using Claude SDK: LLM-led browsing plus reverse-engineering codegen.""" def __init__( self, @@ -36,9 +43,9 @@ def __init__( agent_provider: str = "auto", **kwargs, ): - """Initialize auto engineer with expected HAR path (created by MCP).""" - # `headless` is auto-engineer specific (controls the MCP-spawned browser), - # not BaseEngineer concept; pop before super() to avoid an unknown kwarg. + """Initialize Claude-backed agent engineer (HAR path derives from ``run_id``).""" + # `headless` is auto-engineer specific: for MCP providers it configures the MCP + # server's browser launch; for `agent-browser` it only adjusts prompt wording. 
headless = kwargs.pop("headless", False) har_dir = get_har_dir(run_id, output_dir) har_path = har_dir / "recording.har" @@ -73,7 +80,7 @@ def _build_auto_prompts(self) -> tuple[str, str]: browser_tool_label = ( "Chrome DevTools MCP" if self.agent_provider == "chrome-mcp" - else "MCP" + else ("Vercel agent-browser (shell CLI)" if self.agent_provider == "agent-browser" else "MCP") ) system_prompt = load( @@ -84,11 +91,12 @@ def _build_auto_prompts(self) -> tuple[str, str]: output_files=output_files, ) - template = ( - "auto/user_chrome_mcp" - if self.agent_provider == "chrome-mcp" - else "auto/user_playwright" - ) + if self.agent_provider == "chrome-mcp": + template = "auto/user_chrome_mcp" + elif self.agent_provider == "agent-browser": + template = "auto/user_agent_browser" + else: + template = "auto/user_playwright" template_kwargs = { "prompt": self.prompt, @@ -96,6 +104,10 @@ def _build_auto_prompts(self) -> tuple[str, str]: } if self.agent_provider != "chrome-mcp": template_kwargs["har_path"] = str(self.har_path) + if self.agent_provider == "agent-browser": + template_kwargs.update( + agent_browser_prompt_fields(run_id=self.mcp_run_id, headless=self.headless), + ) user_message = load(template, **template_kwargs) return system_prompt, user_message @@ -116,12 +128,16 @@ async def _handle_tool_permission(self, tool_name: str, input_data: dict[str, An return PermissionResultAllow(updated_input=input_data) def _get_mcp_config(self) -> tuple[str, dict]: - """Return (server_name, mcp_config) based on agent_provider. + """Return ``(server_name, mcp_config)`` for Playwright or Chrome MCP providers. - Auto-connect requires a real headed Chrome instance with a remote - debugging server, so it is dropped in headless mode and the MCP - spawns its own headless Chromium instead. + Not applicable to ``agent-browser`` (calling this raises). Auto-connect Chrome + requires a headed instance with remote debugging unless ``headless`` is set. 
""" + if self.agent_provider == "agent-browser": + raise RuntimeError( + "agent-browser uses the Vercel agent-browser CLI from the shell, not a " + "registered browser MCP server (this helper only builds configs for MCP providers)" + ) if self.agent_provider == "chrome-mcp": args = ["chrome-devtools-mcp@latest", "--no-usage-statistics"] if self.headless: @@ -148,28 +164,48 @@ def _get_mcp_config(self) -> tuple[str, dict]: } async def analyze_and_generate(self) -> dict[str, Any] | None: - """Run auto mode with MCP browser integration. + """Run agent mode with browser automation appropriate to ``agent_provider``. Reuses _process_streaming_response and follow-up loop from ClaudeEngineer. """ self.ui.header(self.run_id, self.prompt, self.model, mode="agent") self.ui.start_analysis() + if self.agent_provider == "agent-browser": + ab_setup = ensure_agent_browser_runtime() + print_agent_browser_setup_notices(self.ui.console, ab_setup) + if not ab_setup.ok: + self.ui.error(ab_setup.error or "agent-browser setup failed") + self.message_store.save_error(ab_setup.error or "agent-browser setup failed") + return None + system_prompt, user_message = self._get_active_prompts() self.message_store.save_prompt(user_message) - mcp_name, mcp_config = self._get_mcp_config() - - options = ClaudeAgentOptions( - system_prompt=system_prompt, - mcp_servers={mcp_name: mcp_config}, - permission_mode="bypassPermissions", - can_use_tool=self._handle_tool_permission, - cwd=str(self.scripts_dir.parent.parent), - model=self.model, - env={"CLAUDECODE": ""}, - stderr=self._handle_cli_stderr, - ) + if self.agent_provider == "agent-browser": + options = ClaudeAgentOptions( + system_prompt=system_prompt, + mcp_servers={}, + allowed_tools=allowed_tools_agent_browser_agent_mode(), + permission_mode="bypassPermissions", + can_use_tool=self._handle_tool_permission, + cwd=str(self.scripts_dir.parent.parent), + model=self.model, + env={"CLAUDECODE": ""}, + stderr=self._handle_cli_stderr, + ) + else: + 
mcp_name, mcp_config = self._get_mcp_config() + options = ClaudeAgentOptions( + system_prompt=system_prompt, + mcp_servers={mcp_name: mcp_config}, + permission_mode="bypassPermissions", + can_use_tool=self._handle_tool_permission, + cwd=str(self.scripts_dir.parent.parent), + model=self.model, + env={"CLAUDECODE": ""}, + stderr=self._handle_cli_stderr, + ) last_result: dict[str, Any] | None = None @@ -215,6 +251,11 @@ async def analyze_and_generate(self) -> dict[str, Any] | None: if self.agent_provider == "chrome-mcp": self.ui.console.print("\n[dim]Make sure chrome-devtools-mcp is available: npx chrome-devtools-mcp@latest[/dim]") self.ui.console.print("[dim]Chrome 146+ required with auto-connect enabled at chrome://inspect/#remote-debugging[/dim]") + elif self.agent_provider == "agent-browser": + self.ui.console.print( + "\n[dim]agent-browser: ensure `agent-browser` is global (`npm install -g`) or pinned " + "via agent_browser_npx_package / RAE_AGENT_BROWSER_PACKAGE, and rerun `reverse-api-engineer`.[/dim]" + ) else: self.ui.console.print("\n[dim]Make sure rae-playwright-mcp is installed: npm install -g rae-playwright-mcp[/dim]") else: @@ -223,10 +264,10 @@ async def analyze_and_generate(self) -> dict[str, Any] | None: class OpenCodeAutoEngineer(OpenCodeEngineer): - """Auto mode using OpenCode SDK: Register MCP server dynamically.""" + """Agent mode via OpenCode: registers a browser MCP server when the provider uses MCP.""" def __init__(self, run_id: str, prompt: str, output_dir: str | None = None, agent_provider: str = "auto", **kwargs): - """Initialize auto engineer with expected HAR path (created by MCP).""" + """Initialize OpenCode-backed agent engineer.""" headless = kwargs.pop("headless", False) har_dir = get_har_dir(run_id, output_dir) har_path = har_dir / "recording.har" @@ -246,13 +287,18 @@ def __init__(self, run_id: str, prompt: str, output_dir: str | None = None, agen def _get_active_prompts(self) -> tuple[str, str]: return 
ClaudeAutoEngineer._build_auto_prompts(self) - def _get_opencode_mcp_config(self) -> dict: + def _get_opencode_mcp_config(self) -> dict | None: """Return OpenCode MCP registration payload based on agent_provider. Auto-connect requires a headed Chrome with a remote debugging server, so it is dropped in headless mode in favor of an MCP-spawned headless Chromium. + + agent-browser skips MCP—the model shells the upstream CLI directly. """ + if self.agent_provider == "agent-browser": + self.mcp_name = None + return None if self.agent_provider == "chrome-mcp": self.mcp_name = f"chrome-devtools-{self._session_id}" cmd = ["npx", "-y", "chrome-devtools-mcp@latest", "--no-usage-statistics"] @@ -291,10 +337,19 @@ def _get_opencode_mcp_config(self) -> dict: } async def analyze_and_generate(self) -> dict[str, Any] | None: - """Run auto mode with OpenCode MCP integration.""" + """Run agent mode via OpenCode (browser MCP registration only for MCP-backed providers).""" self.opencode_ui.header(self.run_id, self.prompt, self.opencode_model, mode="agent") self.opencode_ui.start_analysis() + if self.agent_provider == "agent-browser": + ab_setup = ensure_agent_browser_runtime() + print_agent_browser_setup_notices(self.opencode_ui.console, ab_setup) + if not ab_setup.ok: + msg = ab_setup.error or "agent-browser setup failed" + self.opencode_ui.error(msg) + self.message_store.save_error(msg) + return None + system_prompt, user_message = self._get_active_prompts() active_prompt = f"{system_prompt}\n\n{user_message}" self.message_store.save_prompt(user_message) @@ -331,14 +386,15 @@ async def analyze_and_generate(self) -> dict[str, Any] | None: mcp_config = self._get_opencode_mcp_config() - try: - debug_log(f"Registering MCP server: {self.mcp_name}") - mcp_r = await client.post("/mcp", json=mcp_config) - mcp_r.raise_for_status() - debug_log("MCP server registered successfully") - except Exception as e: - self.opencode_ui.error(f"Failed to register MCP server: {e}") - return None + if 
mcp_config is not None: + try: + debug_log(f"Registering MCP server: {self.mcp_name}") + mcp_r = await client.post("/mcp", json=mcp_config) + mcp_r.raise_for_status() + debug_log("MCP server registered successfully") + except Exception as e: + self.opencode_ui.error(f"Failed to register MCP server: {e}") + return None # Start event stream BEFORE sending message event_task = asyncio.create_task(self._stream_events(client)) @@ -369,13 +425,13 @@ async def analyze_and_generate(self) -> dict[str, Any] | None: self.opencode_ui.stop_streaming() # Deregister MCP server - try: - if self.mcp_name: + if self.mcp_name: + try: debug_log(f"Deregistering MCP server: {self.mcp_name}") await client.delete(f"/mcp/{self.mcp_name}") debug_log("MCP server deregistered") - except Exception as e: - debug_log(f"Failed to deregister MCP server: {e}") + except Exception as e: + debug_log(f"Failed to deregister MCP server: {e}") # Check for errors if self._last_error: @@ -465,10 +521,11 @@ async def analyze_and_generate(self) -> dict[str, Any] | None: class CopilotAutoEngineer: - """Auto mode using Copilot SDK: LLM controls browser via MCP while reverse engineering. + """Agent mode via Copilot SDK. - Uses composition rather than inheritance since CopilotEngineer requires lazy imports. - Delegates to CopilotEngineer for the core logic and adds MCP browser integration. + Delegates to ``CopilotEngineer`` while wiring browser tooling: MCP for ``auto`` / + ``chrome-mcp``, validated ``agent-browser`` CLI bootstrap for ``agent-browser``. + Uses composition because ``CopilotEngineer`` relies on lazy imports. 
""" def __init__( @@ -505,7 +562,7 @@ def stop_sync(self) -> None: self._engineer.stop_sync() async def analyze_and_generate(self) -> dict[str, Any] | None: - """Run auto mode with Copilot SDK and MCP browser integration.""" + """Run agent mode with Copilot SDK (MCP browsers or agent-browser CLI per provider).""" try: from copilot import CopilotClient, PermissionHandler except ImportError: @@ -559,21 +616,30 @@ def on_event(event: Any) -> None: await client.start() if self.agent_provider == "chrome-mcp": - mcp_server_name = "chrome-devtools" chrome_args = ["-y", "chrome-devtools-mcp@latest", "--no-usage-statistics"] if self.headless: chrome_args.append("--headless") else: chrome_args.append("--autoConnect") - mcp_config = { - "type": "local", - "command": "npx", - "args": chrome_args, - "tools": ["*"], - "timeout": 30000, + mcp_servers_payload: dict[str, Any] = { + "chrome-devtools": { + "type": "local", + "command": "npx", + "args": chrome_args, + "tools": ["*"], + "timeout": 30000, + }, } + elif self.agent_provider == "agent-browser": + ab_setup = ensure_agent_browser_runtime() + print_agent_browser_setup_notices(eng.ui.console, ab_setup) + if not ab_setup.ok: + err = ab_setup.error or "agent-browser setup failed" + eng.ui.error(err) + eng.message_store.save_error(err) + return None + mcp_servers_payload = {} else: - mcp_server_name = "playwright" pw_args = [ "-y", "rae-playwright-mcp@latest", @@ -583,12 +649,14 @@ def on_event(event: Any) -> None: ] if self.headless: pw_args.append("--headless") - mcp_config = { - "type": "local", - "command": "npx", - "args": pw_args, - "tools": ["*"], - "timeout": 30000, + mcp_servers_payload = { + "playwright": { + "type": "local", + "command": "npx", + "args": pw_args, + "tools": ["*"], + "timeout": 30000, + }, } async def on_pre_tool_use(input: dict, _invocation: dict) -> dict: @@ -611,7 +679,7 @@ async def on_post_tool_use(input: dict, invocation: dict) -> dict: "model": eng.copilot_model, "streaming": True, 
"infinite_sessions": {"enabled": True}, - "mcp_servers": {mcp_server_name: mcp_config}, + "mcp_servers": mcp_servers_payload, "on_permission_request": PermissionHandler.approve_all, "hooks": { "on_pre_tool_use": on_pre_tool_use, diff --git a/src/reverse_api/cli.py b/src/reverse_api/cli.py index e769d9d..75fd4e4 100644 --- a/src/reverse_api/cli.py +++ b/src/reverse_api/cli.py @@ -209,6 +209,8 @@ def _build_dry_run_payload( import shutil import subprocess + from .agent_browser import ensure_agent_browser_runtime + checks: list[dict] = [] # 1. Prompt @@ -232,13 +234,13 @@ def _build_dry_run_payload( # 3. Agent provider agent_provider = config_manager.get("agent_provider", "auto") - if agent_provider in ("auto", "chrome-mcp"): + if agent_provider in ("auto", "chrome-mcp", "agent-browser"): checks.append({"name": "agent_provider", "status": "ok", "message": agent_provider}) else: checks.append({ "name": "agent_provider", "status": "error", - "message": f"unknown agent_provider {agent_provider!r}; expected 'auto' or 'chrome-mcp'", + "message": f"unknown agent_provider {agent_provider!r}; expected 'auto', 'chrome-mcp', or 'agent-browser'", }) # 4. SDK + API key presence (we only check env var existence, not validity) @@ -264,15 +266,15 @@ def _build_dry_run_payload( else: checks.append({"name": f"sdk:{sdk}", "status": "ok", "message": f"{sdk_env_var} present"}) - # 5. Node.js + npx for MCP servers (both auto and chrome-mcp shell out to - # `npx `; minimal Docker images sometimes ship `node` without - # `npx`, so checking only `node` would lull dry-run into a false ok). + # 5. Node.js + npx — browser MCP subprocesses (`auto`, `chrome-mcp`) and the + # agent-browser CLI (`npx -y …`) all require Node + npx in PATH. 
node = shutil.which("node") if node is None: checks.append({ "name": "node", "status": "error", - "message": "node not found in PATH; required by both auto (rae-playwright-mcp) and chrome-mcp", + "message": "node not found in PATH; required for agent-mode browser tooling " + "(MCP browser servers plus Vercel agent-browser via npx)", }) else: try: @@ -288,7 +290,8 @@ def _build_dry_run_payload( checks.append({ "name": "npx", "status": "error", - "message": "npx not found in PATH; both MCP servers are launched via `npx `", + "message": "npx not found in PATH; required so npm can spawn MCP helpers and " + "fetch/run the pinned agent-browser CLI on demand", }) else: checks.append({"name": "npx", "status": "ok", "message": npx}) @@ -301,7 +304,23 @@ def _build_dry_run_payload( "message": "chrome-mcp without --headless requires Chrome 146+ with auto-connect enabled at chrome://inspect/#remote-debugging — this is not auto-checkable", }) - # 7. Output dir writability — probe with a unique filename so we never + # 7. agent-browser CLI: PATH binary preferred; npm global install fallback. + if agent_provider == "agent-browser": + ab_setup = ensure_agent_browser_runtime() + parts: list[str] = [] + if ab_setup.error: + parts.append(ab_setup.error) + elif ab_setup.ok: + parts.append("`agent-browser` CLI reachable (`--help` OK)") + if ab_setup.notices: + parts.extend(ab_setup.notices) + checks.append({ + "name": "agent-browser:cli", + "status": "error" if not ab_setup.ok else "ok", + "message": " | ".join(parts) if parts else "configured", + }) + + # 8. Output dir writability — probe with a unique filename so we never # clobber a real user file (a fixed name like `.dry_run_write_probe` # could legitimately exist in someone's output dir). 
import secrets @@ -1009,6 +1028,7 @@ def handle_settings(mode_color=THEME_PRIMARY): provider_choices = [ Choice(title="auto (Playwright MCP)", value="auto"), Choice(title="chrome-mcp (Chrome DevTools MCP)", value="chrome-mcp"), + Choice(title="agent-browser (Vercel CLI via shell/Bash)", value="agent-browser"), Choice(title="back", value="back"), ] provider = questionary.select( @@ -1026,7 +1046,27 @@ def handle_settings(mode_color=THEME_PRIMARY): if provider and provider != "back": config_manager.set("agent_provider", provider) console.print(f" [dim]updated[/dim] agent provider: {provider}") - if provider == "chrome-mcp": + if provider == "agent-browser": + console.print() + console.print( + " [dim]Browsing runs through Vercel's agent-browser CLI (not an MCP shim).[/dim]" + ) + console.print( + " [dim]On first agent run RAE installs the CLI with `npm install -g ` when " + "`agent-browser` is missing (you’ll see a banner), then reuses the global shim.[/dim]" + ) + console.print( + " [dim]Pin via `agent_browser_npx_package` in config or `RAE_AGENT_BROWSER_PACKAGE` env.[/dim]" + ) + console.print( + " [dim]Extra operator hints (cloud backends, corp proxy…): " + "`agent_browser_notes` or `RAE_AGENT_BROWSER_NOTES`.[/dim]" + ) + console.print( + " [dim]Chrome download: `agent-browser install` once (add `--with-deps` on Linux).[/dim]" + ) + console.print() + elif provider == "chrome-mcp": console.print() console.print(" [dim]chrome devtools mcp setup:[/dim]") console.print(" [dim] 1. open chrome (version 146 or newer)[/dim]") @@ -1422,7 +1462,7 @@ def manual(prompt, url, reverse_engineer, model, output_dir): "run_id": "" | null, "prompt": "...", "url": "..." | null, - "mode": "auto" | "chrome-mcp" | null, + "mode": "auto" | "chrome-mcp" | "agent-browser" | null, "har_path": "/abs/path/recording.har" | null, "script_path": "/abs/path/api_client.py" | null, "usage": { "input_tokens": ..., "output_tokens": ..., "total_cost": ... 
}, @@ -1726,6 +1766,22 @@ def run_auto_capture( console.print(" [dim]auto-connect disabled — mcp will spawn its own headless chrome[/dim]") console.print() + elif agent_provider == "agent-browser": + console.print() + console.print( + " [dim]agent-browser: Vercel CLI via shell (Reverse API Engineer does not attach " + "Playwright/Chrome browser MCP for this provider).[/dim]" + ) + console.print( + " [dim]Startup prefers PATH `agent-browser`, otherwise installs via " + "`npm install -g ` so Chromium pairs with a stable shim (with a banner); " + "only falls back to `npx -y` if npm cannot run.[/dim]" + ) + console.print( + " [dim]First-time Chrome download: `agent-browser install` (add `--with-deps` on Linux).[/dim]" + ) + console.print() + sdk = config_manager.get("sdk", "claude") if model is None: model = default_model_for_configured_sdk(sdk) @@ -1733,7 +1789,12 @@ def run_auto_capture( run_id = generate_run_id() timestamp = get_timestamp() - mode_label = "chrome-mcp" if agent_provider == "chrome-mcp" else "auto" + if agent_provider == "chrome-mcp": + mode_label = "chrome-mcp" + elif agent_provider == "agent-browser": + mode_label = "agent-browser" + else: + mode_label = "auto" session_manager.add_run( run_id=run_id, prompt=prompt, diff --git a/src/reverse_api/config.py b/src/reverse_api/config.py index 90c108c..aed033c 100644 --- a/src/reverse_api/config.py +++ b/src/reverse_api/config.py @@ -5,7 +5,9 @@ from typing import Any DEFAULT_CONFIG = { - "agent_provider": "auto", # "auto" (Playwright MCP) or "chrome-mcp" (Chrome DevTools MCP) + "agent_provider": "auto", # "auto" | "chrome-mcp" | "agent-browser" + "agent_browser_notes": "", # extra instructions merged into agent-browser prompt / RAE_AGENT_BROWSER_NOTES env + "agent_browser_npx_package": "agent-browser@0", "claude_code_model": "claude-sonnet-4-6", "collector_model": "claude-sonnet-4-6", # Model for collector mode "cursor_model": "composer-2", # Model id for Cursor SDK (see Cursor.models.list()) diff --git 
a/src/reverse_api/cursor_engineer.py b/src/reverse_api/cursor_engineer.py index 5e4e481..d7483ea 100644 --- a/src/reverse_api/cursor_engineer.py +++ b/src/reverse_api/cursor_engineer.py @@ -10,6 +10,7 @@ from pathlib import Path from typing import Any +from .agent_browser import ensure_agent_browser_runtime, print_agent_browser_setup_notices from .base_engineer import BaseEngineer from .tui import ClaudeUI @@ -406,7 +407,7 @@ async def analyze_and_generate(self) -> dict[str, Any] | None: class CursorAutoEngineer(CursorEngineer): - """Agent + capture using Cursor SDK with MCP browser servers.""" + """Agent capture using Cursor SDK—browser via MCP for auto/chrome-mcp or Vercel agent-browser CLI prompts for agent-browser.""" def __init__( self, @@ -434,6 +435,8 @@ def __init__( self.headless = headless def _cursor_mcp_servers(self) -> dict[str, Any]: + if self.agent_provider == "agent-browser": + return {} if self.agent_provider == "chrome-mcp": args = ["-y", "chrome-devtools-mcp@latest", "--no-usage-statistics"] if self.headless: @@ -483,6 +486,15 @@ async def analyze_and_generate(self) -> dict[str, Any] | None: self.ui.console.print("\n[dim]Create an API key at https://cursor.com/dashboard/integrations[/dim]") return None + if self.agent_provider == "agent-browser": + ab_setup = ensure_agent_browser_runtime() + print_agent_browser_setup_notices(self.ui.console, ab_setup) + if not ab_setup.ok: + err = ab_setup.error or "agent-browser setup failed" + self.ui.error(err) + self.message_store.save_error(err) + return None + system_prompt, user_message = ClaudeAutoEngineer._build_auto_prompts(self) self.message_store.save_prompt(user_message) combined = f"{system_prompt}\n\n{user_message}" diff --git a/src/reverse_api/prompts/auto/system.md b/src/reverse_api/prompts/auto/system.md index 34cba02..fe7e028 100644 --- a/src/reverse_api/prompts/auto/system.md +++ b/src/reverse_api/prompts/auto/system.md @@ -1,5 +1,9 @@ -You are an autonomous AI agent with browser control via 
{browser_tool_label} tools. -Your mission is to browse, monitor network traffic, and generate production-ready {language_name} API code. +You are an autonomous AI agent that automates browsing and captures network traffic for reverse engineering. + +For this provider your integration surface is: + +**{browser_tool_label}.** + **Core principle:** The generated scripts must work immediately with zero user effort. Hardcode all credentials, cookies, tokens, and session data you discover. No env vars, no config files, no manual setup required. If you observe a token refresh, cookie renewal, or login flow in the traffic, implement automatic re-authentication so the script doesn't go stale. diff --git a/src/reverse_api/prompts/auto/user_agent_browser.md b/src/reverse_api/prompts/auto/user_agent_browser.md new file mode 100644 index 0000000..ff98725 --- /dev/null +++ b/src/reverse_api/prompts/auto/user_agent_browser.md @@ -0,0 +1,73 @@ + +{prompt} + + + +{scripts_dir} + + + +## Mandatory tooling + +Reverse engineering relies on **`recording.har`** at **`{har_path}`**. You MUST drive browsing **only through the Vercel [agent-browser](https://github.com/vercel-labs/agent-browser) CLI** invoked from shell commands (terminal tools such as Bash). In this provider Reverse API Engineer does **not** register browser automation over MCP; the model uses the CLI exclusively. + +### Host bootstrap (runs before the streaming session) + +Reverse API Engineer prefers an **`agent-browser` binary already on PATH** and validates it with **`{agent_browser_shell} --help`**. When nothing is installed it runs **`npm install -g {agent_browser_npx_package}`** (pin from config or **`RAE_AGENT_BROWSER_PACKAGE`**), notifies you, then re-checks PATH. Only if npm/global installs cannot run does it transparently reuse **`{agent_browser_shell}`** wired through **`npx -y`** for that session—which can mean extra registry traffic on cold machines. 
+ +Warm up upstream context locally once you start: + +```bash +export AGENT_BROWSER_SESSION="{agent_browser_session}" +{agent_browser_shell} skills get core --full +``` + +Treat **`skills get core --full`** as mandatory for default/local flows. Confirm it exits cleanly; if commands error, run **`skills list`** (per upstream), pick the documented bundle (`core` ships with upstream), or escalate with **`{agent_browser_shell} doctor`**. + +{agent_browser_headed_hint}### Session + package + +- Stable session env: **`AGENT_BROWSER_SESSION={agent_browser_session}`** before every invocation (isolates refs/HAR for this run). +- Invocation prefix (**exactly how the host resolved the CLI):** **`{agent_browser_shell}`**. Pin for installs is **`{agent_browser_npx_package}`** (config / **`RAE_AGENT_BROWSER_PACKAGE`**). + +### Cloud / remote browsers + +Upstream documents SaaS-hosted and remote backends. When the operator hints at cloud targets (Bedrock AgentCore, Vercel Sandbox, …), run **`skills list`** first so you fetch bundles that ship with **this CLI copy**, **`skills get `**, then adopt workflow files inside—the flags stay paired with **`{agent_browser_shell}`**. +{agent_browser_notes_block} + +## Workflow + +Interaction model identical to upstream docs: **`snapshot`** for `@eN` refs → **`click` / `fill` / …** → **`snapshot`** after navigation. + + +### Phase 1: BROWSE + +Use shell commands shaped like: + +```bash +export AGENT_BROWSER_SESSION="{agent_browser_session}" +{agent_browser_shell} network har start +{agent_browser_shell} open https://example.com +{agent_browser_shell} snapshot -i --json +# … iterate … +``` + + +### Phase 2: MONITOR + +Use **`network requests --json`** (with filters when noisy) plus occasional snapshots. 
+ +### Phase 3: CAPTURE → `recording.har` + +Before reverse engineering you MUST flush the HAR to the canonical file at the **exact path** below (create parent dirs if needed): + +```bash +export AGENT_BROWSER_SESSION="{agent_browser_session}" +{agent_browser_shell} network har stop {har_path} +{agent_browser_shell} close +``` + +### Phase 4: REVERSE ENGINEER + +Read **`{har_path}`** and emit code under **`{scripts_dir}`** per the system prompt. + +**VPS tips:** first-time hosts run `{agent_browser_shell} install` (add `--with-deps` on Linux). `doctor` diagnoses missing Chrome or permissions. diff --git a/tests/test_agent_browser.py b/tests/test_agent_browser.py new file mode 100644 index 0000000..0983e86 --- /dev/null +++ b/tests/test_agent_browser.py @@ -0,0 +1,128 @@ +"""Tests for reverse_api/agent_browser helpers.""" + +from __future__ import annotations + +from unittest.mock import MagicMock, patch + +import pytest + +from reverse_api import agent_browser + + +@pytest.fixture(autouse=True) +def clear_setup_cache(): + agent_browser.reset_agent_browser_setup_cache() + yield + agent_browser.reset_agent_browser_setup_cache() + + +def test_allowed_tools_contains_bash(): + tools = agent_browser.allowed_tools_agent_browser_agent_mode() + assert "Bash" in tools + + +def test_ensure_missing_node(): + with patch.object(agent_browser.shutil, "which", return_value=None): + st = agent_browser.ensure_agent_browser_runtime() + assert not st.ok + assert st.error and "node" in st.error.lower() + + +def test_global_binary_short_circuits_npm(): + help_ok = MagicMock(returncode=0, stderr="", stdout="ok") + + def which_side(name: str): + return "/fake/agent-browser" if name == "agent-browser" else None + + with patch.object(agent_browser.shutil, "which", side_effect=which_side): + with patch.object(agent_browser.subprocess, "run", return_value=help_ok) as run: + st = agent_browser.ensure_agent_browser_runtime() + assert st.ok + assert st.notices == () + assert 
agent_browser.agent_browser_shell_invoker() == "agent-browser" + run.assert_called_once() + argv = run.call_args[0][0] + assert argv == ["agent-browser", "--help"] + + +def test_global_help_failure(): + proc = MagicMock(returncode=1, stderr="broken", stdout="") + + def which_side(name: str): + return "/broken/ab" if name == "agent-browser" else None + + with patch.object(agent_browser.shutil, "which", side_effect=which_side): + with patch.object(agent_browser.subprocess, "run", return_value=proc): + st = agent_browser.ensure_agent_browser_runtime() + assert not st.ok + assert st.error and "failed" in st.error.lower() + + +def test_npm_global_install_then_help(): + help_ok = MagicMock(returncode=0, stderr="", stdout="usage") + state = {"have_ab": False} + + def which_fn(cmd: str) -> str | None: + if cmd == "agent-browser": + return "/usr/bin/agent-browser" if state["have_ab"] else None + if cmd == "node": + return "/usr/bin/node" + if cmd == "npm": + return "/usr/bin/npm" + return None + + def npm_then_help(argv, **_kwargs): + if argv and str(argv[0]).endswith("npm") and len(argv) >= 2 and argv[1] == "install": + state["have_ab"] = True + return MagicMock(returncode=0, stderr="", stdout="") + return help_ok + + with patch.object(agent_browser.shutil, "which", side_effect=which_fn): + with patch.object(agent_browser.subprocess, "run", side_effect=npm_then_help) as run: + with patch("reverse_api.agent_browser.agent_browser_npx_package", return_value="agent-browser@test"): + st = agent_browser.ensure_agent_browser_runtime() + assert st.ok + assert any("Installed upstream agent-browser" in n for n in st.notices) + assert agent_browser.agent_browser_shell_invoker() == "agent-browser" + + first_cmd = run.call_args_list[0][0][0] + assert first_cmd[:4] == ["/usr/bin/npm", "install", "-g", "agent-browser@test"] + + +def test_npm_missing_falls_back_npx(): + help_ok = MagicMock(returncode=0, stderr="", stdout="usage") + + def which_side(cmd: str) -> str | None: + if cmd == 
"agent-browser": + return None + if cmd == "node": + return "/bin/node" + if cmd == "npm": + return None + if cmd == "npx": + return "/bin/npx" + return None + + with patch.object(agent_browser.shutil, "which", side_effect=which_side): + with patch.object(agent_browser.subprocess, "run", return_value=help_ok): + with patch("reverse_api.agent_browser.agent_browser_npx_package", return_value="agent-browser@x"): + st = agent_browser.ensure_agent_browser_runtime() + assert st.ok + assert any("npm was not found" in n.lower() for n in st.notices) + assert agent_browser.agent_browser_shell_invoker() == "npx -y agent-browser@x" + + +def test_npx_package_env_overrides(monkeypatch: pytest.MonkeyPatch): + monkeypatch.setenv("RAE_AGENT_BROWSER_PACKAGE", "agent-browser@fixture") + assert agent_browser.agent_browser_npx_package() == "agent-browser@fixture" + + +def test_prompt_fields_includes_shell_and_notes(): + with patch("reverse_api.agent_browser.agent_browser_npx_package", return_value="pkg@x"): + with patch("reverse_api.agent_browser.agent_browser_extra_notes", return_value="cloud hint"): + with patch("reverse_api.agent_browser.agent_browser_shell_invoker", return_value="agent-browser"): + fields = agent_browser.agent_browser_prompt_fields(run_id="run1", headless=True) + assert fields["agent_browser_session"] == "rae-run1" + assert fields["agent_browser_npx_package"] == "pkg@x" + assert fields["agent_browser_shell"] == "agent-browser" + assert "cloud hint" in fields["agent_browser_notes_block"] diff --git a/tests/test_auto_engineer.py b/tests/test_auto_engineer.py index bd132a4..9764e47 100644 --- a/tests/test_auto_engineer.py +++ b/tests/test_auto_engineer.py @@ -1,5 +1,6 @@ """Tests for auto_engineer.py - Auto mode engineers.""" +from contextlib import contextmanager from dataclasses import dataclass from pathlib import Path from typing import Any @@ -238,6 +239,60 @@ def test_get_active_prompts_auto(self, tmp_path): assert "browser_navigate" in user_p +class 
TestAgentBrowserPrompt: + """Prompts for shell-driven agent-browser provider.""" + + @contextmanager + def _engineer_with_pkg_mock(self, tmp_path, pkg="agent-browser@test", **kwargs): + defaults = { + "run_id": "test123", + "prompt": "browse and capture", + "model": "claude-sonnet-4-6", + "output_dir": str(tmp_path), + "agent_provider": "agent-browser", + } + defaults.update(kwargs) + with patch("reverse_api.auto_engineer.get_har_dir", return_value=tmp_path / "har"): + with patch("reverse_api.base_engineer.get_scripts_dir", return_value=tmp_path / "scripts"): + with patch("reverse_api.base_engineer.MessageStore"): + with patch("reverse_api.agent_browser.agent_browser_npx_package", return_value=pkg): + yield ClaudeAutoEngineer(**defaults) + + def test_system_labels_shell_cli(self, tmp_path): + with self._engineer_with_pkg_mock(tmp_path) as eng: + sys_p, _user_p = eng._build_auto_prompts() + assert "shell CLI" in sys_p + + def test_user_prompt_uses_ag_cli_session_and_har(self, tmp_path): + with self._engineer_with_pkg_mock(tmp_path) as eng: + _sys_p, user_p = eng._build_auto_prompts() + assert "skills get core" in user_p + assert "AGENT_BROWSER_SESSION" in user_p + assert "npx -y agent-browser@test" in user_p + assert str(eng.har_path) in user_p + + def test_opencode_delegates_prompts(self, tmp_path): + with patch("reverse_api.auto_engineer.get_har_dir", return_value=tmp_path / "har"): + with patch("reverse_api.base_engineer.get_scripts_dir", return_value=tmp_path / "scripts"): + with patch("reverse_api.base_engineer.MessageStore"): + with patch("reverse_api.opencode_engineer.OpenCodeUI"): + with patch( + "reverse_api.agent_browser.agent_browser_npx_package", + return_value="agent-browser@test", + ): + eng = OpenCodeAutoEngineer( + run_id="test123", + prompt="browse and capture", + output_dir=str(tmp_path), + agent_provider="agent-browser", + opencode_provider="anthropic", + opencode_model="claude-opus-4-6", + ) + sys_p, user_p = eng._get_active_prompts() + assert 
"shell CLI" in sys_p + assert "npx" in user_p + + class TestChromeMcpConfig: """Test MCP configuration selection.""" @@ -270,6 +325,12 @@ def test_auto_mcp_config(self, tmp_path): assert "rae-playwright-mcp@latest" in config["args"] assert "--run-id" in config["args"] + def test_agent_browser_no_mcp_config_path(self, tmp_path): + """`_get_mcp_config` is only for MCP-backed providers.""" + eng = self._make_engineer(tmp_path, agent_provider="agent-browser") + with pytest.raises(RuntimeError, match="agent-browser uses the Vercel agent-browser CLI"): + eng._get_mcp_config() + class TestOpenCodeChromeMcpConfig: """Test OpenCodeAutoEngineer Chrome MCP config.""" @@ -305,6 +366,12 @@ def test_opencode_auto_mcp_config(self, tmp_path): assert "playwright" in eng.mcp_name assert "rae-playwright-mcp@latest" in config["config"]["command"] + def test_opencode_agent_browser_skips_mcp(self, tmp_path): + """CLI-only agent-browser mode does not register OpenCode MCP.""" + eng = self._make_engineer(tmp_path, agent_provider="agent-browser") + assert eng._get_opencode_mcp_config() is None + assert eng.mcp_name is None + def test_opencode_chrome_mcp_prompt(self, tmp_path): """OpenCode chrome-mcp uses Chrome DevTools prompt.""" eng = self._make_engineer(tmp_path, agent_provider="chrome-mcp") diff --git a/tests/test_config.py b/tests/test_config.py index 47fa160..6a9c6e8 100644 --- a/tests/test_config.py +++ b/tests/test_config.py @@ -143,6 +143,8 @@ def test_has_required_keys(self): """DEFAULT_CONFIG contains all expected keys.""" expected_keys = { "agent_provider", + "agent_browser_notes", + "agent_browser_npx_package", "claude_code_model", "collector_model", "copilot_model", diff --git a/website/content/docs/cli/scripted-usage.mdx b/website/content/docs/cli/scripted-usage.mdx index c524d06..79b4838 100644 --- a/website/content/docs/cli/scripted-usage.mdx +++ b/website/content/docs/cli/scripted-usage.mdx @@ -55,7 +55,7 @@ reverse-api-engineer run --file api_client.py \ | `run_id` | 
`string \| null` | Stable id for follow-up `show` / `engineer` / `run` calls. | | `prompt` | `string` | The prompt passed in. | | `url` | `string \| null` | Optional starting URL. | -| `mode` | `string \| null` | Provider used (`"auto"` for Playwright MCP, `"chrome-mcp"`). | +| `mode` | `string \| null` | Provider (`"auto"`, `"chrome-mcp"`, `"agent-browser"`). | | `har_path` | `string \| null` | Absolute path to the captured HAR (`recording.har`). | | `script_path` | `string \| null` | Absolute path to the generated client when reverse engineering ran. | | `usage` | `object` | Normalized token plus cost usage (`input_tokens`, `output_tokens`, `cache_read_tokens`, `cache_write_tokens`, `total_cost_usd`) plus raw SDK usage under `raw`. | diff --git a/website/content/docs/configuration/agent.mdx b/website/content/docs/configuration/agent.mdx index 361e564..571483f 100644 --- a/website/content/docs/configuration/agent.mdx +++ b/website/content/docs/configuration/agent.mdx @@ -3,8 +3,7 @@ title: Agent configuration description: Configure the AI agent that drives the browser in agent mode. --- -Agent mode runs an AI loop that drives a browser via MCP. The provider you -pick determines *which* browser and *how* it's controlled. +Agent mode runs an AI loop. **`auto`** and **`chrome-mcp`** attach browser automation via **MCP**. **`agent-browser`** instead delegates browsing to the **[Vercel agent-browser CLI](https://github.com/vercel-labs/agent-browser)**; Reverse API Engineer looks for `agent-browser` on `PATH`, otherwise runs **`npm install -g …`** (using the same pin as config), surfaces a console notice, verifies **`--help`**, and only falls back to **`npx -y …`** if npm cannot complete the install. ## Agent providers @@ -31,6 +30,26 @@ profile. 
- Node.js 20.19+ - Auto-connect enabled at `chrome://inspect/#remote-debugging` +### `agent-browser` + +Uses **[Vercel agent-browser](https://github.com/vercel-labs/agent-browser)** as a standalone CLI—not a Reverse API Engineer MCP shim. + +After the CLI is available, Reverse API Engineer embeds the resolved shell prefix in the user prompt (**`agent-browser …`** vs **`npx -y …`**) and validates with **`--help`**. Prompts steer the model through **`skills get core --full`** (retry with **`skills list`** / matching bundles when operators name a cloud/runtime backend via `agent_browser_notes`). Claude SDK runs widen the tool allow-list for Bash-heavy flows; OpenCode/Copilot/Cursor omit Reverse API Engineer browser MCP registration for `agent-browser` and rely on their own shell primitives. + +Suggested JSON knobs: + +```json +{ + "agent_provider": "agent-browser", + "agent_browser_npx_package": "agent-browser@0", + "agent_browser_notes": "Optional operator guidance (AgentCore endpoints, corp proxy tweaks, sandbox IDs, …)" +} +``` + +You can alternatively drive the same knobs with **`RAE_AGENT_BROWSER_PACKAGE`** or **`RAE_AGENT_BROWSER_NOTES`**. + +First-time infra still needs Chromium bits just like upstream docs describe (`agent-browser install`, `--with-deps` on trimmed Linux VMs). + ## How to switch providers In the CLI: diff --git a/website/content/docs/installation.mdx b/website/content/docs/installation.mdx index 4a7b612..5c1d71e 100644 --- a/website/content/docs/installation.mdx +++ b/website/content/docs/installation.mdx @@ -7,8 +7,7 @@ description: Install Reverse API Engineer with uv or pip, then set up Playwright - **Python 3.11+** - SDK credentials for your selected backend. For the default Claude SDK, set **`ANTHROPIC_API_KEY`**. -- **Node.js + npx** for agent mode. `chrome-mcp` specifically requires - Node.js 20.19+. 
+- **Node.js + npx** for agent mode tooling (browser MCP stacks on `auto` / `chrome-mcp`, plus **`npx` bootstrapping** for **`agent-browser`**). `chrome-mcp` targets Node.js 20.19+. - A few minutes diff --git a/website/content/docs/modes/agent.mdx b/website/content/docs/modes/agent.mdx index d11ba4f..bb2043c 100644 --- a/website/content/docs/modes/agent.mdx +++ b/website/content/docs/modes/agent.mdx @@ -4,8 +4,7 @@ description: Fully autonomous browser interaction. The AI browses, captures, and --- Agent mode hands the steering wheel to the AI. Describe the goal in plain -English and the agent drives a browser through MCP while reverse-engineering -the API in real time. +English; providers such as Playwright MCP or Chrome DevTools MCP route browser actions over MCP, whereas **`agent-browser`** shells out to upstream's CLI directly after Reverse API Engineer resolves the binary (PATH **`agent-browser`**, else **`npm install -g `**, else **`npx -y`**) and injects skill workflows into the prompt. + **Playwright MCP** browser automation with your configured SDK. Combines browser control and real-time reverse-engineering in a single workflow. @@ -62,6 +61,9 @@ Pick the provider in `/settings` then "agent provider". - Node.js 20.19+ - Auto-connect enabled at `chrome://inspect/#remote-debugging` + + **`agent-browser`** is the standalone [Vercel CLI](https://github.com/vercel-labs/agent-browser). Reverse API Engineer prints a banner when it runs **`npm install -g`** for the pin, verifies **`--help`**, prefers the PATH shim across runs, falls back to **`npx`** only when npm fails, and injects **`skills get core --full`**, **`network har …`**, `@eN` snapshots, and **`skills list` / `skills get`** when cloud backends come up (see **`agent_browser_notes`**). Pins live in **`agent_browser_npx_package`** / **`RAE_AGENT_BROWSER_PACKAGE`**. Shell access is Bash-first on Claude SDK runs. + ## Reasoning model