feat: add stagehand #4

Merged
kalil0321 merged 2 commits into main from feat/stagehand-browser-agent on Dec 25, 2025
Conversation

@kalil0321
Owner

No description provided.

@kalil0321
Owner Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread uv.lock
@kalil0321 kalil0321 merged commit 61e16bf into main Dec 25, 2025
1 check passed
@kalil0321 kalil0321 deleted the feat/stagehand-browser-agent branch December 27, 2025 23:22
@kalil0321 kalil0321 restored the feat/stagehand-browser-agent branch April 28, 2026 22:54
kalil0321 added a commit that referenced this pull request May 5, 2026
…-version

Knocks out items #3, #4, #9 of the agent-friendliness backlog (#62).
Schema version stays at 1 since the contract has not shipped to prod —
fields are added in place rather than bumping versions.

Item #3 — Stable `usage` subset
  Different SDKs (Claude / OpenCode / Copilot) emit different keys for
  the same concepts (cache_creation_input_tokens vs cache_write_tokens,
  estimated_cost_usd vs total_cost vs total_cost_usd, etc). New helper
  `_normalize_usage()` maps SDK-native keys into a stable subset
  {input_tokens, output_tokens, cache_read_tokens, cache_write_tokens,
  total_cost_usd} and parks the SDK-native dict under `usage.raw` for
  power users. Wrappers can rely on the top-level keys without breaking
  when the user switches SDK.
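
A minimal sketch of the kind of mapping Item #3 describes, not the actual `_normalize_usage()` implementation. The function name `normalize_usage_sketch`, the alias table, the first-match-wins rule, and the use of `None` for missing keys are all assumptions; only the cache-write and cost aliases are named in the commit message.

```python
from typing import Any

# Hypothetical alias table. The cache-write and cost aliases appear in the
# commit message; "cache_read_input_tokens" is an assumed alias added only
# for illustration.
_ALIASES: dict[str, tuple[str, ...]] = {
    "input_tokens": ("input_tokens",),
    "output_tokens": ("output_tokens",),
    "cache_read_tokens": ("cache_read_tokens", "cache_read_input_tokens"),
    "cache_write_tokens": ("cache_write_tokens", "cache_creation_input_tokens"),
    "total_cost_usd": ("total_cost_usd", "total_cost", "estimated_cost_usd"),
}


def normalize_usage_sketch(raw: dict[str, Any]) -> dict[str, Any]:
    """Map an SDK-native usage dict onto the stable subset of keys."""
    normalized: dict[str, Any] = {}
    for stable_key, candidates in _ALIASES.items():
        # The first alias present in the SDK dict wins; absent keys become None.
        normalized[stable_key] = next(
            (raw[name] for name in candidates if name in raw), None
        )
    # Keep an untouched copy of the SDK-native dict for power users.
    normalized["raw"] = dict(raw)
    return normalized
```

With this shape, a wrapper reads `usage["total_cost_usd"]` regardless of which SDK produced the run, and falls back to `usage["raw"]` only for SDK-specific fields.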

Item #4 — Machine-readable `error_kind`
  Previously agents had to pattern-match on prose like "[Errno 13]
  Permission denied: '/x'" to decide whether to retry / abort / surface
  to the user. New `error_kind` field on every `agent --json` and
  `engineer --json` payload, with a fixed enum:
    misuse | config_invalid | permission_denied | network |
    engine_failure | interrupted | unknown
  Inferred via `_classify_error()` which uses isinstance checks on
  exceptions (KeyboardInterrupt, PermissionError, ConnectionError,
  TimeoutError) and substring fallback on plain messages. Misuse paths
  now pass `error_kind_hint="misuse"` explicitly.
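
The classification order described in Item #4 could look roughly like the sketch below. This is an illustration, not the real `_classify_error()`: the name `classify_error_sketch`, the signature, the mapping of `ConnectionError` and `TimeoutError` to `network`, and the specific fallback substrings are assumptions. Only the enum values, the exception types, and the `error_kind_hint` parameter name come from the commit message.

```python
from typing import Optional, Union

# The fixed enum listed in the commit message.
ERROR_KINDS = {
    "misuse", "config_invalid", "permission_denied", "network",
    "engine_failure", "interrupted", "unknown",
}


def classify_error_sketch(
    error: Union[BaseException, str],
    error_kind_hint: Optional[str] = None,
) -> str:
    """Return one of ERROR_KINDS for an exception or a plain message."""
    # An explicit, valid hint (e.g. "misuse") takes precedence over inference.
    if error_kind_hint in ERROR_KINDS:
        return error_kind_hint
    # isinstance checks on the exception types named in the commit message.
    if isinstance(error, KeyboardInterrupt):
        return "interrupted"
    if isinstance(error, PermissionError):
        return "permission_denied"
    if isinstance(error, (ConnectionError, TimeoutError)):
        return "network"  # assumed mapping
    # Substring fallback for plain messages; these substrings are assumptions.
    message = str(error).lower()
    if "permission denied" in message:
        return "permission_denied"
    if "connection" in message or "timed out" in message:
        return "network"
    return "unknown"
```

A caller can then branch on the returned value, for example retrying on `network` and aborting on `permission_denied`, instead of parsing the prose in `error`.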

Item #9 — `--json-schema-version`
  Top-level `reverse-api-engineer --json-schema-version` prints the
  schema version (currently `1`) and exits 0. Lets a wrapper query the
  contract version without having to invoke a real run.
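
A wrapper might use the flag from Item #9 along these lines. This is a sketch under assumptions: the helper names, the `SUPPORTED_SCHEMA_VERSIONS` set, and the expectation that stdout holds a bare integer are hypothetical. The commit message only says the command prints the schema version and exits 0.

```python
import subprocess
import sys
from typing import Sequence

# Versions this hypothetical wrapper knows how to parse.
SUPPORTED_SCHEMA_VERSIONS = {1}

# The real invocation named in the commit message.
DEFAULT_COMMAND = ("reverse-api-engineer", "--json-schema-version")


def query_schema_version(command: Sequence[str] = DEFAULT_COMMAND) -> int:
    """Run the command and parse its stdout as an integer schema version."""
    completed = subprocess.run(
        list(command), capture_output=True, text=True, check=True
    )
    # Assumes the output is a bare integer such as "1".
    return int(completed.stdout.strip())


def ensure_supported(command: Sequence[str] = DEFAULT_COMMAND) -> int:
    """Raise early if the CLI speaks a schema version we do not handle."""
    version = query_schema_version(command)
    if version not in SUPPORTED_SCHEMA_VERSIONS:
        raise RuntimeError(f"unsupported JSON schema version: {version}")
    return version


if __name__ == "__main__":
    # Stand-in command so the sketch runs without the CLI installed.
    stand_in = [sys.executable, "-c", "print(1)"]
    print(ensure_supported(stand_in))
```

Checking the version once at startup lets the wrapper fail fast before it launches a real run.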

Bonus: KeyboardInterrupt now formats as the literal "interrupted" in
the `error` field (str(KeyboardInterrupt()) is empty); empty exception
messages fall back to the class name.
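
The fallback behaviour in the bonus note can be illustrated as follows. The function name `format_error_sketch` is hypothetical and the real formatting code may differ.

```python
def format_error_sketch(exc: BaseException) -> str:
    """Produce a non-empty string for the `error` field."""
    # str(KeyboardInterrupt()) is empty, so use a fixed literal instead.
    if isinstance(exc, KeyboardInterrupt):
        return "interrupted"
    message = str(exc)
    # Exceptions raised without a message fall back to their class name.
    return message if message else type(exc).__name__
```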

Tests: 9 new tests in test_cli_followups.py (TestSchemaV2Normalization,
TestJsonSchemaVersionFlag) + updated existing tests to assert on the
normalized usage shape and the new error_kind field.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kalil0321 added a commit that referenced this pull request May 6, 2026
…-version
