
BrowserTrace

See what your AI browser agent did and why it failed.

demo

v0.1 alpha · MIT · single-machine · no signup · no cloud


It's 3 AM. Your browser agent crashed mid-run. You have 1500 lines of logs, no screenshots, an expired cookie, and a closed browser. You don't know what page it was on, what selector it tried, or what the model thought before clicking.

BrowserTrace is the recorder you wish you had. One decorator, every step captured, local timeline UI. Find the bug in 30 seconds, not 30 minutes.

Install

Requires Python 3.11+.

# Library only (record traces from your code, no UI):
pip install git+https://github.com/aaronagent/browsertrace

# With the local web UI (`browsertrace` command + screenshots viewer):
pip install "browsertrace[ui] @ git+https://github.com/aaronagent/browsertrace"

# Optional, for the Playwright examples:
playwright install chromium

See it in 60 seconds

git clone https://github.com/aaronagent/browsertrace && cd browsertrace
pip install -e ".[ui]"
pip install playwright && playwright install chromium
python examples/multipage_failure.py    # a research agent fails on Wikipedia
browsertrace                            # opens http://127.0.0.1:3000

Click the failed "research agent: find Tokyo's population" run. You'll see four screenshots (Wikipedia → search → article → failure), the exact moment the agent picked the wrong selector, and the model output expanded inline showing the bad decision.

Use it in your own code

Decorator (simplest)

@trace works on both sync and async functions. The first argument receives the active Run, so you can call run.step(...) (or await run.snapshot(page)) from inside.

from browsertrace import trace

# sync
@trace
def my_agent(run, query: str):
    run.step(action=f"search: {query}", screenshot=...)
    run.step(action="click first result", screenshot=...)

# async
@trace
async def my_async_agent(run, page, query: str):
    await run.snapshot(page, action=f"search: {query}")
    await run.snapshot(page, action="click first result")

my_agent("browser agent debugging")

Playwright shortcut

If you have a Playwright page, use run.snapshot(page, action=...) to skip the url=page.url, screenshot=await page.screenshot() boilerplate:

async with tracer.run("my-task") as run:
    await page.goto("https://example.com")
    await run.snapshot(page, action="opened example.com")
    await page.click("#login")
    await run.snapshot(page, action="clicked login")

For playwright.sync_api, use run.snapshot_sync(page, ...) instead.

Context manager (more control)

from browsertrace import Tracer

tracer = Tracer()

with tracer.run("my-task") as run:
    run.step(
        action="click login button",
        url=page.url,
        screenshot=await page.screenshot(),     # bytes or path; optional
        model_input={"prompt": "..."},          # optional
        model_output={"selector": "#login"},    # optional
        retries=0,                              # extra metadata via kwargs
    )

If the with block raises, the run is marked failed and the error message is recorded against the last step.
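Conceptually, this failure handling is a context manager that catches the exception, flags the run, pins the error to the last recorded step, and re-raises. The following is a simplified toy sketch of that pattern, not BrowserTrace's actual implementation:

```python
from contextlib import contextmanager

class _ToyRun:
    """Toy stand-in for a BrowserTrace run (illustration only)."""
    def __init__(self, name):
        self.name = name
        self.status = "ok"
        self.steps = []

    def step(self, action, **meta):
        self.steps.append({"action": action, "status": "ok", **meta})

@contextmanager
def toy_run(name):
    run = _ToyRun(name)
    try:
        yield run
    except Exception as exc:
        # Mark the whole run failed and record the error
        # against the last step before re-raising.
        run.status = "failed"
        if run.steps:
            run.steps[-1]["status"] = "error"
            run.steps[-1]["error"] = str(exc)
        raise
```

The key design point is that the exception is re-raised: tracing records the failure without swallowing it, so your own error handling still runs.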

Browser Use integration

from browser_use import Agent
from browsertrace import Tracer
from browsertrace.integrations.browser_use import attach_tracer

tracer = Tracer()
agent = Agent(task="...", llm=ChatOpenAI(model="gpt-4o"))

with attach_tracer(agent, tracer, name="my run"):
    await agent.run()

Stagehand integration

from stagehand import Stagehand
from browsertrace import Tracer
from browsertrace.integrations.stagehand import wrap_stagehand

tracer = Tracer()
stagehand = await Stagehand(...).init()
page = wrap_stagehand(stagehand.page, tracer, name="my run")

await page.goto("https://example.com")
await page.act("click the login button")   # auto-recorded
await page.extract("get the headline")      # auto-recorded
page.bt_run.close()

Playwright

See examples/playwright_example.py and examples/multipage_failure.py.

Storage and config

| What | Where | How to override |
| --- | --- | --- |
| SQLite db + screenshots | ~/.browsertrace/ | Tracer(home="...") or BROWSERTRACE_HOME=/path browsertrace |
| UI port | 3000 | BROWSERTRACE_PORT=4000 browsertrace |

What gets recorded per step

| field | type | notes |
| --- | --- | --- |
| action | string | human description: "click", "type x" |
| url | string | page URL at the time |
| screenshot | PNG bytes / path | saved to ~/.browsertrace/screenshots/ |
| model_input | any JSON-able | prompt / messages sent to the LLM |
| model_output | any JSON-able | LLM response / decision |
| status | "ok" / "error" | step-level status (red badge if error) |
| error | string | error message if status is "error" |
| **metadata | any JSON-able | retries, latency, anything else |
| timestamp | float (epoch) | auto |
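To make the schema concrete, here is a hedged sketch of building a JSON-able step record with the field names from the table (illustrative only; the on-disk shape may differ):

```python
import json
import time

def make_step(action, url=None, status="ok", error=None,
              model_input=None, model_output=None, **metadata):
    """Build a JSON-able step record using the field names above
    (illustration, not BrowserTrace's internal format)."""
    return {
        "action": action,
        "url": url,
        "status": status,
        "error": error,
        "model_input": model_input,
        "model_output": model_output,
        "timestamp": time.time(),  # epoch float, set automatically
        **metadata,               # extra kwargs like retries, latency
    }

record = make_step("click login button", url="https://example.com",
                   model_output={"selector": "#login"}, retries=0)
json.dumps(record)  # everything in the record is JSON-serializable
```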

Programmatic access

Every trace is also a JSON object you can feed back to an LLM for self-debugging or pipe into other tools.

# List runs (most recent first; ?status=failed and ?q= filters work)
curl http://127.0.0.1:3000/api/runs
curl 'http://127.0.0.1:3000/api/runs?status=failed&q=tokyo&limit=20'

# Full timeline for one run
curl http://127.0.0.1:3000/api/run/<run_id>

# AI root-cause summary (set OPENAI_API_KEY first; or pip install browsertrace[ai])
curl http://127.0.0.1:3000/api/run/<run_id>/summary

Each run JSON includes the run, every step, model I/O, status, errors, relative timestamps, and first_error_index so an LLM can jump straight to what broke.
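For example, a self-debugging harness can use first_error_index to jump straight to the failing step. A minimal sketch, assuming the payload carries steps and first_error_index as described above (the stub dict below is illustrative, not a real API response):

```python
def failing_step(run_json: dict):
    """Return the first failed step from a run payload, or None if it passed."""
    idx = run_json.get("first_error_index")
    if idx is None:
        return None
    return run_json["steps"][idx]

# Against a live server this would be:
#   import json, urllib.request
#   run_json = json.load(urllib.request.urlopen(
#       "http://127.0.0.1:3000/api/run/<run_id>"))
# A stub payload shows the assumed shape:
run_json = {
    "status": "failed",
    "first_error_index": 1,
    "steps": [
        {"action": "open page", "status": "ok"},
        {"action": "click #login", "status": "error",
         "error": "selector not found"},
    ],
}
print(failing_step(run_json)["error"])  # -> selector not found
```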

Command line

browsertrace                      # serve the web UI
browsertrace list                 # list recent runs in the terminal
browsertrace show <id-or-prefix>  # print a run's timeline
browsertrace export <id> -o run.html   # self-contained HTML bundle (screenshots inlined)

export produces a single HTML file you can email, attach to an issue, or upload anywhere. No server, no DB, fully portable.
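The inlining itself is plain HTML: each screenshot becomes a base64 data URI, so the exported file has no external dependencies. A rough sketch of the idea (not the actual export code):

```python
import base64

def inline_png(png_bytes: bytes) -> str:
    """Embed raw PNG bytes as a self-contained <img> tag via a data URI."""
    b64 = base64.b64encode(png_bytes).decode("ascii")
    return f'<img src="data:image/png;base64,{b64}" alt="step screenshot">'

# In practice png_bytes would be a real screenshot, e.g. from page.screenshot()
tag = inline_png(b"\x89PNG...")
```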

Why not just use ___?

| Tool | Strength | Why you might still want BrowserTrace |
| --- | --- | --- |
| Langfuse / LangSmith / Helicone | Great LLM call tracing: prompt + token + cost | Not browser-agent-first: no DOM, no screenshots, no replay UI built around browser state |
| Browserbase | Hosted browser runtime with built-in recordings | Locks you into their runtime; BrowserTrace works with any local Playwright, Browser Use, or computer-use setup |
| Laminar | Generic agent observability with browser session replay | Heavier, hosted-first; BrowserTrace is local-first, ~700 LOC, drops in via a decorator |
| BrowserTrace | Local replay debugger built around the browser-agent failure loop | OSS, runtime-agnostic, no signup, JSON API for AI self-debugging |

Smallest useful thing for "my browser agent failed, what happened" — drop in, fix the bug, get back to building.

Cloud / Team (coming soon)

Local BrowserTrace will always be free OSS. We're working on a hosted version for teams that need:

  • One-click share links for failed runs (send to a teammate, paste in a Slack thread, attach to a GitHub issue, no git clone required)
  • CI ingestion — upload traces from your test runs, get a digest of failures
  • Multi-run regression detection — "this DOM changed since last passing run"
  • Team workspaces, comments, retention beyond a single laptop

If you want it, open an issue with the cloud-interest label describing your agent setup and team size. Pricing will likely be:

| Tier | Price | For |
| --- | --- | --- |
| OSS Local | Free | Solo, local debugging |
| Solo Cloud | $19/mo | Individual dev, hosted share + AI summaries |
| Team | $99/mo | 5 seats, CI, workspaces, regression detection |
| Scale | $249/mo | High volume, long retention |
| Enterprise | Custom | SSO, VPC, SOC2 |

Real-world feedback shapes what ships first.

Roadmap

  • v0.1 (you are here): SDK + local UI + screenshots + model I/O + step status
  • v0.2: One-command browsertrace export <run_id> static HTML bundle (shareable, redactable)
  • v0.3: Search and filter the run list + timeline (action / URL / model text)
  • v0.4: AI root-cause summary on failed runs (consumes /api/run/<id> JSON)
  • v0.5: Multi-run comparison ("did this regression appear after my last commit?")
  • v0.6: First-class Stagehand / Skyvern adapters
  • v0.7: Optional cloud share links

Contributing

This is a v0.1 alpha. The fastest way to help:

  1. Try it on a real agent. Open an issue with what broke or what you wished worked.
  2. If you build a Stagehand / Skyvern / computer use adapter, PRs welcome.
  3. If you have a screenshot of a beautiful failure trace, share it on X with @aaronagent — it's launch fuel.

License

MIT — see LICENSE.
