See what your AI browser agent did and why it failed.
v0.1 alpha · MIT · single-machine · no signup · no cloud
It's 3 AM. Your browser agent crashed mid-run. You have 1500 lines of logs, no screenshots, an expired cookie, and a closed browser. You don't know what page it was on, what selector it tried, or what the model thought before clicking.
BrowserTrace is the recorder you wish you had. One decorator, every step captured, local timeline UI. Find the bug in 30 seconds, not 30 minutes.
Requires Python 3.11+.
```bash
# Library only (record traces from your code, no UI):
pip install git+https://github.com/aaronagent/browsertrace

# With the local web UI (`browsertrace` command + screenshots viewer):
pip install "browsertrace[ui] @ git+https://github.com/aaronagent/browsertrace"

# Optional, for the Playwright examples:
playwright install chromium
```

```bash
git clone https://github.com/aaronagent/browsertrace && cd browsertrace
pip install -e ".[ui]"
pip install playwright && playwright install chromium
python examples/multipage_failure.py  # a research agent fails on Wikipedia
browsertrace                          # opens http://127.0.0.1:3000
```

Click the failed "research agent: find Tokyo's population" run.
You'll see 4 screenshots (Wikipedia → search → article → failure), the exact
moment the agent picked the wrong selector, and the model output expanded
inline showing the bad decision.
`@trace` works on both sync and async functions. The first argument receives the
active `Run`, so you can call `run.step(...)` (or `await run.snapshot(page)`)
from inside.
```python
from browsertrace import trace

# sync
@trace
def my_agent(run, query: str):
    run.step(action=f"search: {query}", screenshot=...)
    run.step(action="click first result", screenshot=...)

# async
@trace
async def my_async_agent(run, page, query: str):
    await run.snapshot(page, action=f"search: {query}")
    await run.snapshot(page, action="click first result")

my_agent("browser agent debugging")
```

If you have a Playwright page, use `run.snapshot(page, action=...)` to skip
the `url=page.url, screenshot=await page.screenshot()` boilerplate:
```python
async with tracer.run("my-task") as run:
    await page.goto("https://example.com")
    await run.snapshot(page, action="opened example.com")
    await page.click("#login")
    await run.snapshot(page, action="clicked login")
```

For `playwright.sync_api`, use `run.snapshot_sync(page, ...)` instead.
```python
from browsertrace import Tracer

tracer = Tracer()

with tracer.run("my-task") as run:
    run.step(
        action="click login button",
        url=page.url,
        screenshot=await page.screenshot(),  # bytes or path; optional
        model_input={"prompt": "..."},        # optional
        model_output={"selector": "#login"},  # optional
        retries=0,  # extra metadata via kwargs
    )
```

If the `with` block raises, the run is marked failed and the error message
is recorded against the last step.
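The failure marking follows the usual context-manager semantics. This toy stand-in (not BrowserTrace's actual implementation) shows the pattern of catching the exception, recording it, and re-raising:

```python
from contextlib import contextmanager

@contextmanager
def traced_run(name):
    """Toy stand-in for tracer.run(): an exception inside the `with` block
    marks the run failed and records the error message, then propagates."""
    record = {"name": name, "status": "ok", "error": None, "steps": []}
    try:
        yield record
    except Exception as exc:
        record["status"] = "failed"
        record["error"] = str(exc)
        raise  # the caller still sees the original exception

try:
    with traced_run("my-task") as r:
        r["steps"].append({"action": "click login"})
        raise RuntimeError("selector #login not found")
except RuntimeError:
    pass

print(r["status"], "-", r["error"])  # failed - selector #login not found
```

Because the exception is re-raised, your own error handling (retries, alerts) keeps working; the trace is recorded as a side effect.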
```python
from browser_use import Agent
from langchain_openai import ChatOpenAI  # adjust to your browser_use version's LLM wrapper

from browsertrace import Tracer
from browsertrace.integrations.browser_use import attach_tracer

tracer = Tracer()
agent = Agent(task="...", llm=ChatOpenAI(model="gpt-4o"))

with attach_tracer(agent, tracer, name="my run"):
    await agent.run()
```

```python
from stagehand import Stagehand

from browsertrace import Tracer
from browsertrace.integrations.stagehand import wrap_stagehand

tracer = Tracer()
stagehand = await Stagehand(...).init()
page = wrap_stagehand(stagehand.page, tracer, name="my run")

await page.goto("https://example.com")
await page.act("click the login button")  # auto-recorded
await page.extract("get the headline")    # auto-recorded
page.bt_run.close()
```

See `examples/playwright_example.py` and `examples/multipage_failure.py`.
| What | Where | How to override |
|---|---|---|
| SQLite db + screenshots | `~/.browsertrace/` | `Tracer(home="...")` or `BROWSERTRACE_HOME=/path browsertrace` |
| UI port | `3000` | `BROWSERTRACE_PORT=4000 browsertrace` |
| field | type | notes |
|---|---|---|
| `action` | string | human description: "click", "type x" |
| `url` | string | page URL at the time |
| `screenshot` | PNG bytes / path | saved to `~/.browsertrace/screenshots/` |
| `model_input` | any JSON-able | prompt / messages sent to the LLM |
| `model_output` | any JSON-able | LLM response / decision |
| `status` | `"ok"` / `"error"` | step-level status (red badge if error) |
| `error` | string | error message if status is `"error"` |
| `**metadata` | any JSON-able | retries, latency, anything else |
| `timestamp` | float (epoch) | auto |
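Extra metadata must survive JSON encoding. A quick generic sanity check with the standard `json` module (not a BrowserTrace API) catches bad values before they reach a step:

```python
import json

# Anything passed as extra metadata must survive json.dumps; raw bytes,
# sets, or open handles would raise TypeError here rather than deep
# inside a traced run.
metadata = {"retries": 2, "latency_ms": 840, "selector": "#login"}
encoded = json.dumps(metadata)
print(encoded)
```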
Every trace is also a JSON object you can feed back to an LLM for self-debugging or pipe into other tools.
```bash
# List runs (most recent first; ?status=failed and ?q= filters work)
curl http://127.0.0.1:3000/api/runs
curl 'http://127.0.0.1:3000/api/runs?status=failed&q=tokyo&limit=20'

# Full timeline for one run
curl http://127.0.0.1:3000/api/run/<run_id>

# AI root-cause summary (set OPENAI_API_KEY first; or pip install "browsertrace[ai]")
curl http://127.0.0.1:3000/api/run/<run_id>/summary
```

Each run JSON includes the run, every step, model I/O, status, errors, relative
timestamps, and `first_error_index` so an LLM can jump straight to what broke.
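As a sketch of that self-debugging loop, the helper below jumps straight to the failing step. The JSON shape is assumed from the fields listed above (`steps`, `first_error_index`, per-step `action` and `error`), and `first_failure` is a hypothetical helper, not a BrowserTrace API:

```python
def first_failure(run_json: dict):
    """Return the first failing step of a run, or None if the run passed."""
    idx = run_json.get("first_error_index")
    if idx is None:
        return None
    return run_json["steps"][idx]

# Example run JSON in the assumed shape:
run_json = {
    "name": "research agent: find Tokyo's population",
    "status": "failed",
    "first_error_index": 2,
    "steps": [
        {"action": "open wikipedia.org", "status": "ok"},
        {"action": "search: Tokyo", "status": "ok"},
        {"action": "click first result", "status": "error",
         "error": "selector .result-link not found"},
    ],
}

bad = first_failure(run_json)
print(bad["action"], "->", bad["error"])
```

In practice you would load `run_json` from `GET /api/run/<run_id>` and hand the failing step (or the whole run) to an LLM as context.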
```bash
browsertrace                          # serve the web UI
browsertrace list                     # list recent runs in the terminal
browsertrace show <id-or-prefix>      # print a run's timeline
browsertrace export <id> -o run.html  # self-contained HTML bundle (screenshots inlined)
```

`export` produces a single HTML file you can email, attach to an issue, or
upload anywhere. No server, no DB, fully portable.
| Tool | Strength | Why you might still want BrowserTrace |
|---|---|---|
| Langfuse / LangSmith / Helicone | Great LLM call tracing, prompt + token + cost | Not browser-agent-first: no DOM, no screenshot, no replay UI built around browser state |
| Browserbase | Hosted browser runtime with built-in recordings | Locks you into their runtime; BrowserTrace works with any local Playwright, Browser Use, computer use |
| Laminar | Generic agent observability with browser session replay | Heavier, hosted-first; BrowserTrace is local-first, ~700 LOC, drop in via decorator |
| BrowserTrace | Local replay debugger built around the browser-agent failure loop | OSS, runtime-agnostic, no signup, JSON API for AI self-debug |
Smallest useful thing for "my browser agent failed, what happened" — drop in, fix the bug, get back to building.
Local BrowserTrace will always be free OSS. We're working on a hosted version for teams that need:
- One-click share links for failed runs (send to a teammate, paste in a Slack thread, attach to a GitHub issue, no `git clone` required)
- CI ingestion: upload traces from your test runs, get a digest of failures
- Multi-run regression detection: "this DOM changed since last passing run"
- Team workspaces, comments, retention beyond a single laptop
If you want it, open an issue with the cloud-interest label describing your agent setup and team size. Pricing will likely be:
| Tier | Price | For |
|---|---|---|
| OSS Local | Free | Solo, local debugging |
| Solo Cloud | $19/mo | Individual dev, hosted share + AI summaries |
| Team | $99/mo | 5 seats, CI, workspaces, regression detection |
| Scale | $249/mo | High-volume, long retention |
| Enterprise | Custom | SSO, VPC, SOC2 |
Real-world feedback shapes what ships first.
- v0.1 (you are here): SDK + local UI + screenshots + model I/O + step status
- v0.2: One-command `browsertrace export <run_id>` static HTML bundle (shareable, redactable)
- v0.3: Search and filter the run list + timeline (action / URL / model text)
- v0.4: AI root-cause summary on failed runs (consumes `/api/run/<id>` JSON)
- v0.5: Multi-run comparison ("did this regression appear after my last commit?")
- v0.6: First-class Stagehand / Skyvern adapters
- v0.7: Optional cloud share links
This is a v0.1 alpha. The fastest way to help:
- Try it on a real agent. Open an issue with what broke or what you wished worked.
- If you build a Stagehand / Skyvern / computer use adapter, PRs welcome.
- If you have a screenshot of a beautiful failure trace, share it on X with `@aaronagent`; it's launch fuel.
MIT — see LICENSE.
