Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Removed
- **`browser-use` and `stagehand` agent providers**: Both have been retired due to upstream churn. Agent mode now offers only `auto` (Playwright MCP) and `chrome-mcp` (Chrome DevTools MCP). Configs with `agent_provider` set to `browser-use` or `stagehand` migrate to `auto`; the `browser_use_model` and `stagehand_model` keys are dropped silently. The `[agent]` optional-dependency extra is removed.
- **Tag system (`@record-only`, `@codegen`, `@id`, `@docs`, `@help`)**: The inline tag syntax inside REPL prompts has been removed. Surviving capabilities have first-class CLI flags / arguments instead:
- `@record-only` → `manual --no-engineer` (already existed)
- `@id <run_id>` → `engineer <run_id>` (positional, already existed)
- `@id <run_id> <prompt>` → `engineer <run_id> --prompt "..."` (new flag; layered as additional instructions on top of the captured run's original goal)
- `@id <run_id> --fresh <prompt>` → `engineer <run_id> --fresh --prompt "..."` (new flag; with `--fresh`, `--prompt` fully replaces the original goal)
- `@codegen` and Playwright-action recording: removed entirely with the `ActionRecorder` and `playwright_codegen` modules (low usage, replaced by the standard capture+engineer flow)
- `@docs` (OpenAPI generation): removed for now; will return as a dedicated `docs` subcommand in a follow-up
- `@help`: superseded by the existing `/help` slash command in the REPL
- **`--reverse-engineer/--no-engineer` flag on `agent`**: was parsed but never propagated to the auto-capture path — removed. Agent mode runs an integrated capture + engineering pipeline; use `manual --no-engineer` for HAR-only recordings.

### Added
- **`engineer --prompt`** and **`engineer --fresh`** CLI flags to replace the equivalent `@id <run_id> [--fresh] <prompt>` REPL syntax.

## [0.7.1] - 2026-04-06

Expand Down
38 changes: 0 additions & 38 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,6 @@ No more manual reverse engineering—just browse, capture, and get clean API cod
- [Engineer Mode](#engineer-mode)
- [Agent Mode](#agent-mode)
- [Collector Mode](#collector-mode)
- [Tags](#tags)
- [Configuration](#-configuration)
- [Model Selection](#model-selection)
- [Agent Configuration](#agent-configuration)
Expand All @@ -57,7 +56,6 @@ No more manual reverse engineering—just browse, capture, and get clean API cod
- 📦 **Production Ready**: Generated scripts include error handling, type hints, and documentation
- 💾 **Session History**: All runs saved locally with full message logs
- 💰 **Cost Tracking**: Detailed token usage and cost estimation with cache support
- 🏷️ **Tag System**: Powerful tags for fine-grained control (@record-only, @codegen, @docs, @id)

### Limitations

Expand Down Expand Up @@ -200,42 +198,6 @@ $ reverse-api-engineer
# Data saved to: ./collected/js_frameworks/
```

## 🏷️ Tags

Tags provide additional control and functionality within each mode:

### Manual/Agent Mode Tags

- **`@record-only`** - Record HAR file only, skip reverse engineering step
- Example: `@record-only navigate checkout flow`
- Useful when you want to capture traffic for later analysis

- **`@codegen`** - Record browser actions and generate Playwright automation script
- Example: `@codegen navigate to google`
- Captures clicks, fills, and navigations to create a reusable Playwright script

### Engineer Mode Tags

- **`@id <run_id>`** - Switch context to a specific run ID
- Example: `@id abc123`
- Loads a previous capture session for re-engineering

- **`@id <run_id> <prompt>`** - Run engineer on a specific run with instructions
- Example: `@id abc123 extract user profile`
- Re-processes a capture with new instructions

- **`@id <run_id> --fresh <prompt>`** - Start fresh (ignore previous scripts)
- Example: `@id abc123 --fresh restart analysis`
- Generates new code from scratch, ignoring previous implementations

- **`@docs`** - Generate API documentation (OpenAPI spec) for the latest run
- Example: `@docs`
- Creates OpenAPI specification from captured traffic

- **`@id <run_id> @docs`** - Generate API documentation for a specific run
- Example: `@id abc123 @docs`
- Creates OpenAPI specification for a specific capture session

## 🔧 Configuration

Settings stored in `~/.reverse-api/config.json`:
Expand Down
51 changes: 0 additions & 51 deletions src/reverse_api/action_recorder.py

This file was deleted.

1 change: 0 additions & 1 deletion src/reverse_api/base_engineer.py
Original file line number Diff line number Diff line change
Expand Up @@ -504,7 +504,6 @@ def _build_prompts(self) -> tuple[str, str]:
scripts_dir=str(self.scripts_dir),
existing_client_guidance=self._get_existing_client_guidance(),
additional_instructions=additional_instructions,
tag_extra="@docs" if is_docs else "",
tag_mode_label="Documentation" if is_docs else "Re-engineer",
run_id=self.run_id,
har_parent=str(self.har_path.parent),
Expand Down
204 changes: 0 additions & 204 deletions src/reverse_api/browser.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,6 @@
from rich.console import Console
from rich.status import Status

from .action_recorder import ActionRecorder, RecordedAction
from .utils import get_har_dir, get_timestamp

console = Console()
Expand Down Expand Up @@ -188,13 +187,11 @@ def __init__(
prompt: str,
output_dir: str | None = None,
use_real_chrome: bool = True, # New option to use real Chrome
enable_action_recording: bool = False,
):
self.run_id = run_id
self.prompt = prompt
self.output_dir = output_dir
self.use_real_chrome = use_real_chrome
self.enable_action_recording = enable_action_recording

self.har_dir = get_har_dir(run_id, output_dir)
self.har_path = self.har_dir / "recording.har"
Expand All @@ -207,193 +204,6 @@ def __init__(
self._user_agent = random.choice(USER_AGENTS)
self._using_persistent = False # Track if using persistent context

self.action_recorder = ActionRecorder() if enable_action_recording else None

def _inject_action_recorder(self, page: Page) -> None:
"""Inject action recording script into page.

Uses console.log + page.on('console') for reliable capture.
Works best with stealth Chromium mode (not real Chrome).
"""
if not self.enable_action_recording:
return

# Simple JS that logs actions to console with a special prefix
recorder_js = """
window.__recordedActions = [];
window.__lastUrl = null;

document.addEventListener('click', (e) => {
const el = e.target;

// Build a robust selector by traversing up to find parent with ID
function buildSelector(element) {
// Priority 1: data-testid on element or close parent
let current = element;
for (let i = 0; i < 3 && current; i++) {
if (current.dataset && current.dataset.testid) {
const path = i === 0 ? '' : ' ' + getPathFromAncestor(current, element);
return '[data-testid="' + current.dataset.testid.replace(/"/g, '\\"') + '"]' + path;
}
current = current.parentElement;
}

// Priority 2: element has short ID
if (element.id && element.id.length < 20) {
return '#' + element.id;
}

// Priority 3: find parent with ID and build path
current = element.parentElement;
let depth = 1;
while (current && depth < 5) {
if (current.id && current.id.length < 20) {
const path = getPathFromAncestor(current, element);
const selector = '#' + current.id + ' > ' + path;
// Verify it's unique
if (document.querySelectorAll(selector).length === 1) {
return selector;
}
}
current = current.parentElement;
depth++;
}

// Priority 4: name attribute
if (element.name) {
return '[name="' + element.name.replace(/"/g, '\\"') + '"]';
}

// Priority 5: aria-label
if (element.getAttribute && element.getAttribute('aria-label')) {
return '[aria-label="' + element.getAttribute('aria-label').replace(/"/g, '\\"') + '"]';
}

// Priority 6: role + text for buttons
if ((element.tagName === 'BUTTON' || element.role === 'button') && element.innerText) {
const text = element.innerText.trim().substring(0, 30);
if (text && !text.includes('\\n')) {
return 'button:has-text(' + JSON.stringify(text) + ')';
}
}

// Priority 7: link text
if (element.tagName === 'A' && element.innerText) {
const text = element.innerText.trim().substring(0, 30);
if (text && !text.includes('\\n')) {
return 'a:has-text(' + JSON.stringify(text) + ')';
}
}

// Priority 8: tag + class
if (element.className && typeof element.className === 'string') {
const cls = element.className.split(' ').filter(c => c && c.length < 30 && !c.includes('hover') && !c.includes('active'))[0];
if (cls) {
const baseSelector = element.tagName.toLowerCase() + '.' + cls;
const matches = document.querySelectorAll(baseSelector);
if (matches.length === 1) return baseSelector;
const idx = Array.from(matches).indexOf(element);
if (idx >= 0) return baseSelector + ' >> nth=' + idx;
}
}

// Fallback: tag name
return element.tagName.toLowerCase();
}

// Get path from ancestor to descendant
function getPathFromAncestor(ancestor, descendant) {
if (ancestor === descendant) return descendant.tagName.toLowerCase();

let path = [];
let current = descendant;
while (current && current !== ancestor) {
path.unshift(current.tagName.toLowerCase());
current = current.parentElement;
}
return path.join(' > ');
}

const selector = buildSelector(el);
const action = {type: 'click', selector: selector, timestamp: Date.now()};
window.__recordedActions.push(action);
console.log('__ACTION__' + JSON.stringify(action));
}, true);

document.addEventListener('input', (e) => {
if (e.target.tagName === 'INPUT' || e.target.tagName === 'TEXTAREA') {
const el = e.target;
let selector = '';
if (el.id && el.id.length < 20) selector = '#' + el.id;
else if (el.name) selector = '[name="' + el.name.replace(/"/g, '\\"') + '"]';
else if (el.placeholder) selector = '[placeholder="' + el.placeholder.replace(/"/g, '\\"') + '"]';
else selector = el.tagName.toLowerCase();

const action = {type: 'fill', selector: selector, value: el.value, timestamp: Date.now()};
window.__recordedActions.push(action);
console.log('__ACTION__' + JSON.stringify(action));
}
}, true);

document.addEventListener('keydown', (e) => {
if (e.key === 'Enter') {
const el = e.target;
let selector = '';
if (el.id && el.id.length < 20) selector = '#' + el.id;
else if (el.name) selector = '[name="' + el.name.replace(/"/g, '\\"') + '"]';
else selector = el.tagName.toLowerCase();

const action = {type: 'press', selector: selector, value: 'Enter', timestamp: Date.now()};
window.__recordedActions.push(action);
console.log('__ACTION__' + JSON.stringify(action));
}
}, true);

// Only log navigation for top-level main frame, not iframes
if (window === window.top) {
const url = window.location.href;
// Skip about:blank, blob:, data:, service workers, embeds
if (url && !url.startsWith('about:') && !url.startsWith('blob:') &&
!url.startsWith('data:') && !url.includes('/embed') &&
!url.includes('service_worker') && !url.includes('googletagmanager')) {
// Only log if URL changed
if (url !== window.__lastUrl) {
window.__lastUrl = url;
console.log('__ACTION__' + JSON.stringify({type: 'navigate', url: url, timestamp: Date.now()}));
}
}
}
"""

# Listen to console for actions
import json

last_url = [None] # Mutable to track last URL

def on_console(msg):
text = msg.text
if text.startswith("__ACTION__"):
try:
action_json = text[10:] # Remove '__ACTION__' prefix
action_data = json.loads(action_json)

# Filter duplicate navigations
if action_data.get("type") == "navigate":
url = action_data.get("url", "")
if url == last_url[0]:
return # Skip duplicate
last_url[0] = url

if self.action_recorder:
self.action_recorder.add_action(RecordedAction(**action_data))
except Exception as e:
console.print(f" [dim]action parse error: {e}[/dim]")

page.on("console", on_console)
page.add_init_script(recorder_js)

console.print(" [dim]action recording enabled[/dim]")

def _save_metadata(self, end_time: str) -> None:
"""Save run metadata to JSON file."""
metadata = {
Expand Down Expand Up @@ -460,9 +270,6 @@ def _start_with_real_chrome(self, start_url: str | None = None) -> Path:
# For HAR recording & context
page = self._context.new_page()

if self.enable_action_recording:
self._inject_action_recorder(page)

if start_url:
page.goto(start_url, wait_until="domcontentloaded")
else:
Expand Down Expand Up @@ -555,9 +362,6 @@ def _start_with_stealth_chromium(self, start_url: str | None = None) -> Path:
# Open initial page
page = self._context.new_page()

if self.enable_action_recording:
self._inject_action_recorder(page)

if start_url:
page.goto(start_url, wait_until="domcontentloaded")
else:
Expand Down Expand Up @@ -652,13 +456,5 @@ def close(self) -> Path:
console.print(" [dim]capture saved[/dim]")
console.print(" [dim]metadata synced[/dim]")

if self.action_recorder:
try:
actions_path = self.har_dir / "actions.json"
self.action_recorder.save(actions_path)
console.print(" [dim]actions saved[/dim]")
except Exception as e:
console.print(f" [yellow]warning: error saving actions: {e}[/yellow]")

return self.har_path

Loading