17 changes: 17 additions & 0 deletions docs/content/docs/(configuration)/secrets.mdx
@@ -100,6 +100,23 @@ Worker creates a GitHub bot account

The tool accepts an optional `category` parameter to override auto-categorization. If omitted, the same rules as the API apply -- known internal credentials are categorized as system, everything else as tool.

### Browser Tool Integration

The `browser_type` tool supports a `secret` parameter for typing sensitive values (passwords, tokens, API keys) into web form fields without exposing them in tool arguments, output, or LLM context.

```
Task says "sign in with password from GH_PASSWORD"
→ Worker calls browser_type(index: 2, secret: "GH_PASSWORD")
→ Secret store resolves "GH_PASSWORD" to its stored value
→ Value is typed into the password field via CDP
→ Tool output: "Typed secret 'GH_PASSWORD' into element at index 2"
→ The actual password never appears in tool arguments or results
```

This is the only safe way to enter credentials in the browser. The `text` parameter types values in plain text that appears in tool output and LLM context -- never use it for passwords or tokens. Both tool-category and system-category secrets can be resolved via the `secret` parameter, since the value is used internally by the browser tool and never exposed to subprocesses.

If a task references an environment variable for a credential (e.g. "use password from `$GH_PASSWORD`"), the secret should be stored in the secret store under that name and passed via the `secret` parameter. Do not shell out to `echo` or otherwise resolve credential values into plain text.
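
The resolution flow described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `secretStore`, `TypeResult`, and `typeSecret` are hypothetical names, and the in-memory `Map` stands in for the real secret store. The point it shows is that only the secret's name, never its value, reaches the tool output.

```typescript
// Hypothetical in-memory stand-in for the secret store (assumption for illustration).
const secretStore = new Map<string, string>([
  ["GH_PASSWORD", "example-password-not-real"],
]);

interface TypeResult {
  typed: string;  // value sent to the page; never returned to the model
  output: string; // text reported back as tool output
}

// Resolve the secret by name and report only the name in the output.
function typeSecret(index: number, secretName: string): TypeResult {
  const value = secretStore.get(secretName);
  if (value === undefined) {
    throw new Error(`Secret '${secretName}' not found in secret store`);
  }
  return {
    typed: value,
    output: `Typed secret '${secretName}' into element at index ${index}`,
  };
}

console.log(typeSecret(2, "GH_PASSWORD").output);
// prints "Typed secret 'GH_PASSWORD' into element at index 2"
```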

## Encryption at Rest

The store has two modes:
39 changes: 26 additions & 13 deletions docs/content/docs/(features)/browser.mdx
@@ -79,12 +79,27 @@ Single `browser` tool with an `action` discriminator. All actions share one argu
Act kinds:

- **`click`** -- Click an element by ref.
-- **`type`** -- Click an element to focus it, then type `text` into it.
+- **`type`** -- Click an element to focus it, then type `text` into it. For passwords and credentials, use `secret` instead of `text` (see below).
- **`press_key`** -- Press a keyboard key (e.g., `Enter`, `Tab`, `Escape`). With `element_ref`, presses on that element. Without, dispatches to the page.
- **`hover`** -- Move the mouse over an element.
- **`scroll_into_view`** -- Scroll an element into the viewport.
- **`focus`** -- Focus an element without clicking.

### Secure Text Entry

The `browser_type` tool accepts a `secret` parameter as an alternative to `text` for typing sensitive values into form fields. When `secret` is provided, the value is resolved from the [secret store](/docs/secrets) and typed into the element without ever appearing in tool arguments, output, or LLM context.

```
browser_type { index: 2, secret: "GH_PASSWORD" }
→ resolves GH_PASSWORD from the secret store
→ types the value into the element
→ returns: "Typed secret 'GH_PASSWORD' into element at index 2"
```

The `text` and `secret` parameters are mutually exclusive. Use `text` for non-sensitive input (usernames, search queries, form data) and `secret` for passwords, tokens, API keys, and any other credentials. Both tool-category and system-category secrets can be used.

If a task references a credential by environment variable name (e.g. "use password from `$GH_PASSWORD`"), the worker should pass that name to the `secret` parameter. Never shell out to `echo` or otherwise resolve credential values into plain text for browser entry.
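
The mutual-exclusivity rule can be sketched as an argument check. This is a hedged illustration only: `BrowserTypeArgs`, `TypeInput`, and `resolveTypeInput` are hypothetical names, and rejecting a call that supplies neither parameter is an assumption the text above does not state.

```typescript
// Hypothetical argument shape; field names follow the examples above.
interface BrowserTypeArgs {
  index: number;
  text?: string;
  secret?: string;
}

type TypeInput =
  | { kind: "text"; value: string }
  | { kind: "secret"; name: string };

// Accept exactly one of `text` or `secret`.
function resolveTypeInput(args: BrowserTypeArgs): TypeInput {
  if (args.secret !== undefined) {
    if (args.text !== undefined) {
      throw new Error("`text` and `secret` are mutually exclusive");
    }
    return { kind: "secret", name: args.secret };
  }
  if (args.text !== undefined) {
    return { kind: "text", value: args.text };
  }
  // Assumption: a call with neither parameter is rejected.
  throw new Error("one of `text` or `secret` is required");
}
```

Returning a tagged union keeps the secret name and the plain-text value on separate code paths, so a later step cannot confuse one for the other.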

### JavaScript

| Action | Required Args | Description |
@@ -96,19 +111,17 @@ Act kinds:
## Typical Workflow

```
-browser { action: "launch" }
-browser { action: "navigate", url: "https://example.com" }
-browser { action: "snapshot" }
-→ e0: [textbox] "Email", e1: [textbox] "Password", e2: [button] "Sign In"
-browser { action: "act", act_kind: "click", element_ref: "e0" }
-browser { action: "act", act_kind: "type", element_ref: "e0", text: "user@example.com" }
-browser { action: "act", act_kind: "click", element_ref: "e1" }
-browser { action: "act", act_kind: "type", element_ref: "e1", text: "hunter2" }
-browser { action: "act", act_kind: "click", element_ref: "e2" }
-browser { action: "snapshot" }
+browser_launch {}
+browser_navigate { url: "https://example.com" }
+browser_snapshot {}
+→ [index=0] textbox "Email", [index=1] textbox "Password", [index=2] button "Sign In"
+browser_type { index: 0, text: "user@example.com" }
+browser_type { index: 1, secret: "ACCOUNT_PASSWORD" } ← password from secret store
+browser_click { index: 2 }
+browser_snapshot {}
 → (new page state after login)
-browser { action: "screenshot" }
-browser { action: "close" }
+browser_screenshot {}
+browser_close {}
```

## Screenshots
36 changes: 27 additions & 9 deletions docs/design-docs/tool-nudging.md
@@ -21,14 +21,20 @@ Workers must explicitly signal a terminal outcome via `set_status(kind: "outcome
```
Worker loop starts
→ LLM completion
-→ If response includes tool calls → continue normally
-→ If text-only response:
+→ If response includes tool calls → continue normally (reset consecutive nudge counter)
+→ If response has no tool calls:
 → Has outcome been signaled via set_status(kind: "outcome")? → allow exit
-→ No outcome signal? → Terminate with "tool_nudge" reason → retry with nudge prompt
-→ Max 2 retries per prompt request
-→ If retries exhausted → worker fails (PromptCancelled)
+→ No outcome signal?
+→ Response has text OR is reasoning-only (no text, no tools)?
+→ Terminate with "tool_nudge" reason → retry with nudge prompt
+→ Max 2 consecutive text-only responses before failure
+→ Any successful tool call resets the counter
+→ If consecutive retries exhausted → worker fails (PromptCancelled)
+→ After initial task loop: if result text is empty → worker fails (safety net)
```

The retry budget tracks **consecutive** text-only (or reasoning-only) responses, not total nudges across the worker's lifetime. Whenever a tool call completes successfully, the counter resets to zero. This prevents long-running workers from being killed after a brief lapse into narration mid-task: a worker that has made 12 tool calls and then narrates once gets the full retry budget again, not one depleted by earlier nudges.
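
The consecutive-counter behaviour can be illustrated with a small simulation. This is a sketch under stated assumptions, not the worker's code: `Turn`, `MAX_CONSECUTIVE_NUDGES`, and `survives` are hypothetical names, and it reads "max 2" as allowing two nudges, with the third consecutive response lacking tool calls causing failure. The flow above could also be read as failing one response earlier.

```typescript
type Turn = "tool_call" | "no_tool_call";

// Assumed to mirror the retry limit of 2 described above.
const MAX_CONSECUTIVE_NUDGES = 2;

// Returns true if the worker survives the whole sequence,
// false if the consecutive nudge budget is exhausted.
function survives(turns: Turn[]): boolean {
  let nudgeAttempts = 0;
  for (const turn of turns) {
    if (turn === "tool_call") {
      nudgeAttempts = 0; // a successful tool call resets the counter
      continue;
    }
    if (nudgeAttempts >= MAX_CONSECUTIVE_NUDGES) {
      return false; // budget exhausted: the worker fails
    }
    nudgeAttempts += 1; // nudge and retry
  }
  return true;
}
```

Under this reading, a sequence that interleaves a tool call between pairs of narration turns never fails, whereas a lifetime counter would have failed it on the third narration turn overall.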

### Outcome Signaling

The `set_status` tool has a `kind` field:
@@ -62,15 +68,24 @@ let hook = SpacebotHook::new(...)
- The flag persists for the rest of the prompt request

**Nudge decision** (`src/hooks/spacebot.rs:should_nudge_tool_usage`):
-- Returns `true` when: policy enabled, nudge active, no outcome signaled, response is text-only
+- Returns `true` when: policy enabled, nudge active, no outcome signaled, AND response has no tool calls (either non-empty text or reasoning-only with no text)
- Returns `false` when: outcome signaled, response has tool calls, policy disabled
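
The decision rules above can be expressed as a small predicate. This is a TypeScript sketch of the logic only, since the actual function lives in Rust and is not shown here; `NudgeState`, `ResponseSummary`, and `shouldNudgeToolUsage` are hypothetical names. A response with no text, no reasoning, and no tool calls is not covered by the rules above, and this sketch returns `false` for it as an assumption.

```typescript
interface NudgeState {
  policyEnabled: boolean;
  nudgeActive: boolean;
  outcomeSignaled: boolean;
}

interface ResponseSummary {
  hasToolCalls: boolean;
  hasText: boolean;      // non-empty text content
  hasReasoning: boolean; // reasoning blocks present
}

function shouldNudgeToolUsage(state: NudgeState, response: ResponseSummary): boolean {
  if (!state.policyEnabled || !state.nudgeActive) return false;
  if (state.outcomeSignaled) return false;
  if (response.hasToolCalls) return false;
  // Text-only, or reasoning-only with no text. A fully empty response
  // falls through to false (assumption; not specified in the source).
  return response.hasText || response.hasReasoning;
}
```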

**Empty result guard** (`src/agent/worker.rs`):
- After the initial task loop, if the result text is empty/whitespace-only, the worker fails instead of reaching `Done`. This is a safety net for cases where a reasoning-only response slips past the nudge gate (e.g. after nudge retries are exhausted).

**Consecutive nudge counter** (`on_tool_result` in `src/hooks/spacebot.rs`):
- When nudge policy is enabled and a tool call completes, `nudge_attempts` is reset to zero
- This means the retry budget is per-consecutive-text-only-streak, not per-prompt-lifetime
- A worker that alternates between tool calls and narration will never exhaust its budget

**Retry flow** (`prompt_with_tool_nudge_retry`):
-1. Reset nudge state at start of prompt (clears `outcome_signaled`)
+1. Reset nudge state at start of prompt (clears `outcome_signaled` and `nudge_attempts`)
 2. On text-only response without outcome: terminate with `TOOL_NUDGE_REASON`
-3. Catch termination in retry loop, prune history, retry with nudge prompt
+3. Catch termination in retry loop, increment `nudge_attempts`, prune history, retry with nudge prompt
 4. On success: prune the nudge prompt from history to keep context clean
-5. After `TOOL_NUDGE_MAX_RETRIES` (2) exhausted: `PromptCancelled` propagates to worker → `WorkerState::Failed`
+5. After `TOOL_NUDGE_MAX_RETRIES` (2) consecutive text-only responses: `PromptCancelled` propagates to worker → `WorkerState::Failed`
+6. Any successful tool call (inside the Rig agent loop) resets `nudge_attempts` to zero via `on_tool_result`

**History hygiene**:
- Synthetic nudge prompts are removed from history on both success and retry
@@ -100,6 +115,9 @@ The nudging behavior has comprehensive test coverage:
- **Unit tests** (`src/hooks/spacebot.rs`):
- `nudges_on_every_text_only_response_without_outcome` — nudge fires on every text-only response
- `nudges_after_tool_calls_without_outcome` — the exact bug case (read_skill + progress status + text exit)
- `nudges_on_reasoning_only_response_without_outcome` — nudge fires when response is only Reasoning blocks (no text, no tools)
- `tool_result_resets_consecutive_nudge_counter` — successful tool calls reset the nudge attempt counter
- `nudge_counter_not_reset_when_policy_disabled` — counter untouched when nudging is disabled
- `outcome_signal_allows_text_only_completion` — outcome signal unlocks exit
- `progress_status_does_not_signal_outcome` — explicit progress kind doesn't unlock
- `default_status_kind_does_not_signal_outcome` — omitted kind doesn't unlock
10 changes: 9 additions & 1 deletion interface/src/api/client.ts
@@ -146,6 +146,13 @@ export interface OpenCodePartUpdatedEvent {
part: OpenCodePart;
}

export interface WorkerTextEvent {
type: "worker_text";
agent_id: string;
worker_id: string;
text: string;
}

export type ApiEvent =
| InboundMessageEvent
| OutboundMessageEvent
@@ -159,7 +166,8 @@ export type ApiEvent =
| BranchCompletedEvent
| ToolStartedEvent
| ToolCompletedEvent
-| OpenCodePartUpdatedEvent;
+| OpenCodePartUpdatedEvent
+| WorkerTextEvent;
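
A consumer of the new union member might narrow on the `type` discriminant roughly as follows. This is a sketch under assumptions: `OtherEvent`, `ApiEventSketch`, and `collectWorkerText` are hypothetical stand-ins (the full `ApiEvent` union is not shown in this diff), and treating each `text` value as a chunk to append rather than a full snapshot is an assumption about the event's semantics.

```typescript
// Shape copied from the interface added in this diff.
interface WorkerTextEvent {
  type: "worker_text";
  agent_id: string;
  worker_id: string;
  text: string;
}

// Hypothetical stand-in for the other ApiEvent members.
interface OtherEvent {
  type: "other";
}

type ApiEventSketch = WorkerTextEvent | OtherEvent;

// Concatenate text per worker, skipping every other event type.
function collectWorkerText(events: ApiEventSketch[]): Map<string, string> {
  const byWorker = new Map<string, string>();
  for (const event of events) {
    if (event.type !== "worker_text") continue; // narrows to WorkerTextEvent below
    const previous = byWorker.get(event.worker_id) ?? "";
    byWorker.set(event.worker_id, previous + event.text);
  }
  return byWorker;
}
```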

async function fetchJson<T>(path: string): Promise<T> {
const response = await fetch(`${API_BASE}${path}`);