Merged
14 changes: 14 additions & 0 deletions README.md
@@ -35,6 +35,8 @@ The proxy supports the following OpenAI-compatible parameters in the `/v1/chat/c
- **`temperature`** (number): Controls randomness (passed to the engine).
- **`max_tokens`** (number): Limits the length of the generated response.
- **`reasoning_effort`** (string): For models with reasoning capabilities (e.g., `low`, `medium`, `high`).
- **`tools` / `tool_choice`**: Standard OpenAI tool-calling fields used by agentic clients.
- **`browseros_mode`** (boolean): Optional strict mode toggle for BrowserOS-like agentic clients. When tools are provided, this mode is **enabled by default** unless you explicitly set `browseros_mode: false`.

## Quick Start

@@ -70,6 +72,18 @@ curl -N -X POST http://localhost:8080/v1/chat/completions \
- **Port**: Set via `PORT` environment variable (defaults to 8080).
- **Models**: The proxy automatically queries your local Codex installation for available model slugs.

### BrowserOS Configuration

If your BrowserOS agent sends tool definitions but the model replies with text like _"I’m unable to control the browser from this environment."_, verify:

- your `/v1/chat/completions` request body includes `tools` (this auto-enables BrowserOS strict mode)
- `browseros_mode: true` is set explicitly if you want the intent to be clear (optional)
- `tool_choice` is still sent when your client supports it

To disable strict BrowserOS behavior for non-agentic use-cases, set `browseros_mode: false`.

In this mode the proxy handles only the LLM/provider behavior; BrowserOS continues to execute the actual browser tools on its side.
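
As an illustration of the checklist above, the following is a minimal sketch of a request body an agentic client might send. The helper `buildAgentRequest`, the model slug `example-model`, and the `navigate` tool are hypothetical names chosen for this example; only `tools`, `tool_choice`, and `browseros_mode` come from the README text.

```typescript
// Shape of an OpenAI-style function tool definition.
interface ToolDef {
  type: "function";
  function: { name: string; description?: string; parameters?: unknown };
}

// Subset of the request body fields discussed in the README.
interface ChatRequestBody {
  model: string;
  messages: { role: string; content: string }[];
  tools?: ToolDef[];
  tool_choice?: unknown;
  browseros_mode?: boolean;
  stream?: boolean;
}

// Hypothetical helper: builds a body that sends tools, tool_choice,
// and an explicit browseros_mode flag.
function buildAgentRequest(
  model: string,
  userText: string,
  tools: ToolDef[],
  strict = true,
): ChatRequestBody {
  return {
    model,
    messages: [{ role: "user", content: userText }],
    tools,
    tool_choice: "auto",
    browseros_mode: strict,
    stream: true,
  };
}

// "example-model" and the "navigate" tool are placeholders, not real slugs or tools.
const agentBody = buildAgentRequest("example-model", "Open example.com", [
  {
    type: "function",
    function: {
      name: "navigate",
      description: "Navigate the browser to a URL",
      parameters: {
        type: "object",
        properties: { url: { type: "string" } },
        required: ["url"],
      },
    },
  },
]);

console.log(JSON.stringify(agentBody, null, 2));
```

The resulting object would be sent as the JSON body of a `POST` to `/v1/chat/completions`; passing `false` as the last argument corresponds to the opt-out described above.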

## Architecture

This project uses a typed `CodexClient` that manages a persistent `codex app-server` background process. Communication happens over a high-speed JSON-RPC channel on `stdio`, ensuring that the model state remains warm and ready for immediate inference.
23 changes: 22 additions & 1 deletion src/codex-client.ts
@@ -121,7 +121,12 @@ export class CodexClient {

async *chatCompletionStream(
messages: Message[],
options: { model: string; tools?: any[]; tool_choice?: any },
options: {
model: string;
tools?: any[];
tool_choice?: any;
browseros_mode?: boolean;
},
): AsyncGenerator<CodexStreamEvent> {
const hasTools = options.tools && options.tools.length > 0;

@@ -151,6 +156,19 @@ export class CodexClient {
baseInstructions = (baseInstructions || "") + toolBlock;
}

if (hasTools && options.browseros_mode) {
const browserOSToolModeInstructions =
`\n\n## BrowserOS Tool Execution Mode\n\n` +
`You are running as a tool-calling planner inside BrowserOS. ` +
`You can and must control the browser by emitting tool calls. ` +
`Do not claim you cannot access or control the browser/environment. ` +
`If the request is actionable with available tools, your response MUST include at least one <tool_call> block. ` +
`Prefer tool-call-only output for action steps. ` +
`For shopping workflows, adding products to cart is permitted; avoid checkout/payment unless explicitly requested. ` +
`If an action needs multiple steps, emit only the next required tool call(s) for the current step.\n`;
baseInstructions = (baseInstructions || "") + browserOSToolModeInstructions;
}

// --- Format conversation messages into prompt ---
let fullPrompt = "";
for (const msg of nonSystemMessages) {
@@ -276,6 +294,9 @@ export class CodexClient {
}
eventQueue.push({ type: "tool_calls", calls: toolCalls });
} else {
console.warn(
`[CodexClient] Tools provided but no tool calls parsed. Assistant preview: ${accumulatedText.slice(0, 300).replace(/\s+/g, " ")}`,

Copilot AI Mar 28, 2026


This warning logs a preview of the assistant output (accumulatedText) when tools are provided but no tool calls are parsed. That content may include sensitive user data or secrets and will end up in server logs. Consider gating this behind a debug flag and/or redacting content (e.g., log only length / hash / truncated-with-redaction) to reduce accidental data exposure.

Suggested change
`[CodexClient] Tools provided but no tool calls parsed. Assistant preview: ${accumulatedText.slice(0, 300).replace(/\s+/g, " ")}`,
`[CodexClient] Tools provided but no tool calls parsed. Assistant preview redacted (length=${accumulatedText.length}).`,
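
One way to combine both ideas from this comment (a debug gate plus redaction) is sketched below. The helper name `formatMissingToolCallWarning` and the `debug` parameter are assumptions made for this example; they are not part of the PR.

```typescript
// Hypothetical helper: builds the warning text, exposing a preview only when
// the caller has opted in to debug logging.
function formatMissingToolCallWarning(text: string, debug: boolean): string {
  const base = `[CodexClient] Tools provided but no tool calls parsed`;
  if (!debug) {
    // Default path: log only the length, never the content.
    return `${base}. Assistant preview redacted (length=${text.length}).`;
  }
  // Debug path: same truncation and whitespace collapsing as the original line.
  const preview = text.slice(0, 300).replace(/\s+/g, " ");
  return `${base}. Assistant preview: ${preview}`;
}

const sampleOutput = "my password is  hunter2\nplease log in";
console.warn(formatMissingToolCallWarning(sampleOutput, false));
```

The `debug` flag could be derived from an environment variable or a config option; which one the project would prefer is left open here.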

);
// No tool calls found, emit as plain message
eventQueue.push({ type: "message", text: accumulatedText });
}
108 changes: 91 additions & 17 deletions src/codex.ts
@@ -38,6 +38,7 @@ export interface CodexOptions {
signal?: AbortSignal;
tools?: any[];
tool_choice?: any;
browseros_mode?: boolean;
}

export interface ParsedToolCall {
@@ -74,27 +75,77 @@ export type CodexStreamEvent =
*/
export function parseToolCalls(text: string): ParsedToolCall[] {
const calls: ParsedToolCall[] = [];
const regex = /<tool_call>([\s\S]*?)<\/tool_call>/g;
let match;
const seen = new Set<string>();
let callIndex = 0;
while ((match = regex.exec(text)) !== null) {

const pushCall = (raw: any) => {
const name = raw?.name || raw?.toolName || raw?.function?.name || "";
const argsRaw =
raw?.arguments ?? raw?.input ?? raw?.parameters ?? raw?.function?.arguments;
if (!name) return;
const args =
typeof argsRaw === "string"
? argsRaw
: JSON.stringify(argsRaw ?? {});
Comment on lines +82 to +89

Copilot AI Mar 28, 2026


parseToolCalls will currently treat any JSON object containing a name field (e.g., { "name": "Alice" }), or any JSON code fence containing such an object, as a tool call. With tools enabled, this can lead to false-positive tool executions if the model outputs or echoes arbitrary JSON. Consider tightening detection (e.g., require the canonical tool-call shape like {name, arguments} / {function:{name,arguments}}), and ideally validate that name matches one of the provided tool definitions before emitting a tool call.

Suggested change
const name = raw?.name || raw?.toolName || raw?.function?.name || "";
const argsRaw =
raw?.arguments ?? raw?.input ?? raw?.parameters ?? raw?.function?.arguments;
if (!name) return;
const args =
typeof argsRaw === "string"
? argsRaw
: JSON.stringify(argsRaw ?? {});
if (!raw || typeof raw !== "object") return;
// Support only canonical tool-call shapes:
// 1) { function: { name, arguments } }
// 2) { name, arguments }
let name: string | undefined;
let argsRaw: any;
if (raw.function && typeof raw.function === "object") {
name = typeof raw.function.name === "string" ? raw.function.name : undefined;
argsRaw = raw.function.arguments;
} else {
name = typeof raw.name === "string" ? raw.name : undefined;
argsRaw = raw.arguments;
}
// Require both a valid name and explicit arguments to treat this as a tool call.
if (!name || argsRaw === undefined) return;
const args =
typeof argsRaw === "string" ? argsRaw : JSON.stringify(argsRaw);
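
The second half of this comment (checking `name` against the provided tool definitions) is not covered by the suggested change. A minimal sketch of such a check follows; `collectToolNames` and `filterKnownCalls` are hypothetical helpers, and the `ParsedToolCall` interface is a local copy of the shape the parser emits.

```typescript
// Local copy of the call shape emitted by the parser in this PR.
interface ParsedToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

// Hypothetical helper: collect declared names from either tool schema
// ({ type: "function", function: { name } } or { name, input_schema }).
function collectToolNames(tools: unknown[]): Set<string> {
  const names = new Set<string>();
  for (const tool of tools) {
    if (!tool || typeof tool !== "object") continue;
    const t = tool as { name?: unknown; function?: { name?: unknown } | null };
    const name = t.function?.name ?? t.name;
    if (typeof name === "string" && name.length > 0) names.add(name);
  }
  return names;
}

// Hypothetical helper: keep only calls whose name was actually declared.
function filterKnownCalls(
  calls: ParsedToolCall[],
  tools: unknown[],
): ParsedToolCall[] {
  const allowed = collectToolNames(tools);
  return calls.filter((call) => allowed.has(call.function.name));
}

const declaredTools: unknown[] = [
  { type: "function", function: { name: "click", parameters: {} } },
  { name: "navigate", input_schema: {} },
];
const parsedCalls: ParsedToolCall[] = [
  { id: "call_1", type: "function", function: { name: "click", arguments: '{"selector":"#go"}' } },
  { id: "call_2", type: "function", function: { name: "Alice", arguments: "{}" } },
];
const acceptedCalls = filterKnownCalls(parsedCalls, declaredTools);
console.log(acceptedCalls.map((call) => call.function.name));
```

With this filter in place, an echoed object such as `{ "name": "Alice" }` would be dropped because no tool named `Alice` was declared, regardless of how permissive the shape detection is.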

const key = `${name}::${args}`;
if (seen.has(key)) return;
seen.add(key);
calls.push({
id: `call_${Date.now()}_${callIndex++}`,
type: "function",
function: {
name,
arguments: args,
},
});
};

// Format 1: explicit <tool_call>...</tool_call> blocks.
const taggedRegex = /<tool_call>([\s\S]*?)<\/tool_call>/g;
let match;
while ((match = taggedRegex.exec(text)) !== null) {
try {
pushCall(JSON.parse(match[1].trim()));
} catch {
// Ignore malformed block.
}
}

// Format 2: JSON fenced blocks that contain a single call, call list, or tool_calls.
const fencedJsonRegex = /```(?:json)?\s*([\s\S]*?)```/g;
while ((match = fencedJsonRegex.exec(text)) !== null) {
const candidate = match[1].trim();
try {
const parsed = JSON.parse(match[1].trim());
calls.push({
id: `call_${Date.now()}_${callIndex++}`,
type: "function",
function: {
name: parsed.name || parsed.function?.name || "",
arguments:
typeof parsed.arguments === "string"
? parsed.arguments
: JSON.stringify(parsed.arguments ?? parsed.parameters ?? {}),
},
});
const parsed = JSON.parse(candidate);
if (Array.isArray(parsed)) {
for (const item of parsed) pushCall(item);
} else if (parsed?.tool_calls && Array.isArray(parsed.tool_calls)) {
for (const item of parsed.tool_calls) pushCall(item);
} else {
pushCall(parsed);
}
} catch {
// Not valid JSON; ignore.
}
}

// Format 3: whole response is a JSON object/array describing tool calls.
const trimmed = text.trim();
if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
try {
const parsed = JSON.parse(trimmed);
if (Array.isArray(parsed)) {
for (const item of parsed) pushCall(item);
} else if (parsed?.tool_calls && Array.isArray(parsed.tool_calls)) {
for (const item of parsed.tool_calls) pushCall(item);
} else {
pushCall(parsed);
}
} catch {
// Skip malformed tool calls
// Not parseable as JSON; ignore.
}
}

return calls;
}
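
To make the three extraction formats and the de-duplication step easier to follow, here is a simplified standalone sketch. It is not the PR's `parseToolCalls`: the function `extractCalls`, the `SketchCall` type, and the sample reply are assumptions for illustration, and the fence pattern is built with `new RegExp` only to avoid literal backtick runs inside this example.

```typescript
type SketchCall = { name: string; arguments: string };

function extractCalls(text: string): SketchCall[] {
  const calls: SketchCall[] = [];
  const seen = new Set<string>();

  // Normalize one candidate object and drop exact duplicates.
  const push = (raw: any): void => {
    const name = raw?.name ?? raw?.function?.name;
    if (typeof name !== "string" || name === "") return;
    const argsRaw = raw?.arguments ?? raw?.function?.arguments ?? {};
    const args = typeof argsRaw === "string" ? argsRaw : JSON.stringify(argsRaw);
    const key = `${name}::${args}`;
    if (seen.has(key)) return;
    seen.add(key);
    calls.push({ name, arguments: args });
  };

  // Accept a single call, a list of calls, or an object with a tool_calls list.
  const pushAll = (parsed: any): void => {
    if (Array.isArray(parsed)) {
      for (const item of parsed) push(item);
    } else if (Array.isArray(parsed?.tool_calls)) {
      for (const item of parsed.tool_calls) push(item);
    } else {
      push(parsed);
    }
  };

  const tryParse = (candidate: string, handle: (parsed: any) => void): void => {
    try {
      handle(JSON.parse(candidate));
    } catch {
      // Not valid JSON; ignore this candidate.
    }
  };

  const fence = "`{3}"; // regex source for three consecutive backticks
  const tagged = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  const fenced = new RegExp(fence + "(?:json)?\\s*([\\s\\S]*?)" + fence, "g");

  let m: RegExpExecArray | null;
  // Format 1: explicit <tool_call> blocks.
  while ((m = tagged.exec(text)) !== null) tryParse(m[1].trim(), push);
  // Format 2: fenced JSON blocks.
  while ((m = fenced.exec(text)) !== null) tryParse(m[1].trim(), pushAll);
  // Format 3: the whole response is JSON.
  const trimmed = text.trim();
  if (trimmed.startsWith("{") || trimmed.startsWith("[")) tryParse(trimmed, pushAll);
  return calls;
}

// Sample reply that repeats the same call in two formats.
const fenceMark = "`".repeat(3);
const sampleReply =
  '<tool_call>{"name":"navigate","arguments":{"url":"https://example.com"}}</tool_call>\n' +
  fenceMark + "json\n" +
  '[{"name":"navigate","arguments":{"url":"https://example.com"}},' +
  '{"name":"click","arguments":{"selector":"#go"}}]\n' +
  fenceMark;
const sampleCalls = extractCalls(sampleReply);
console.log(sampleCalls.map((call) => call.name));
```

The `name::arguments` key is what keeps the repeated `navigate` call from being emitted twice. The same sketch also shows the permissiveness raised in review: a bare `{"name":"Alice"}` response is accepted as a call with empty arguments.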

@@ -104,7 +155,19 @@ export function parseToolCalls(text: string): ParsedToolCall[] {
* are available and the expected output format.
*/
export function buildToolInstructions(tools: any[], tool_choice?: any): string {
let block = `\n\n## Available Tools\n\nYou have access to the following tools to perform actions. You MUST use these tools to fulfill the user's request. Do NOT describe steps or give instructions — instead, call the appropriate tool.\n\nTo call a tool, output one or more tool calls in this exact format (you may output multiple for parallel execution):\n<tool_call>{"name": "tool_name", "arguments": {"param": "value"}}</tool_call>\n\nIMPORTANT RULES:\n- ALWAYS use tool calls to act. NEVER respond with step-by-step instructions when a tool can do the job.\n- You can call multiple tools in a single response.\n- After a tool call, wait for the result before proceeding.\n- If the user asks you to navigate somewhere, use the navigate tool. If they ask you to click, use the click tool. Etc.\n\nHere are the tools:\n\n`;
let block =
`\n\n## Available Tools\n\n` +
`You are an agentic planner operating through external tools. ` +
`When tools are available, your next action MUST be emitted as tool calls, not prose refusals.\n\n` +
`Tool call output format (required):\n` +
`<tool_call>{"name": "tool_name", "arguments": {"param": "value"}}</tool_call>\n\n` +
`IMPORTANT RULES:\n` +
`- If a user request is actionable with provided tools, emit one or more <tool_call> blocks.\n` +
`- Do not say you cannot access the browser/environment when browser tools are provided.\n` +
`- Keep normal text minimal. Prefer tool-call-only responses for action steps.\n` +
`- After tool results are returned, emit the next tool call(s) needed to continue.\n` +
`- For commerce tasks, adding an item to cart is allowed; do not attempt checkout/payment unless user explicitly requests it.\n\n` +
`Here are the tools:\n\n`;

for (const tool of tools) {
if (tool.type === "function" && tool.function) {
@@ -115,6 +178,16 @@ export function buildToolInstructions(tools: any[], tool_choice?: any): string {
block += `Parameters: ${JSON.stringify(fn.parameters)}\n`;
}
block += `\n`;
} else if (tool?.name) {
// Support alternate tool schemas used by some providers/agents.
block += `### ${tool.name}\n`;
if (tool.description) block += `${tool.description}\n`;
if (tool.input_schema) {
block += `Parameters: ${JSON.stringify(tool.input_schema)}\n`;
} else if (tool.parameters) {
block += `Parameters: ${JSON.stringify(tool.parameters)}\n`;
}
block += `\n`;
}
}
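
The two branches above render the same kind of block from two tool schemas. A possible way to express that, sketched here under assumptions, is to normalize each tool first and render once. `normalizeTool`, `renderTool`, and `NormalizedTool` are hypothetical names, and the `### name` heading for the function-style branch is assumed to match the alternate branch because that part of the file is not shown in this hunk.

```typescript
interface NormalizedTool {
  name: string;
  description?: string;
  parameters?: unknown;
}

// Hypothetical helper: map either schema onto one shape.
// Unlike the loop above, this sketch also requires a name on function-style tools.
function normalizeTool(tool: any): NormalizedTool | undefined {
  if (tool?.type === "function" && tool.function?.name) {
    const fn = tool.function;
    return { name: fn.name, description: fn.description, parameters: fn.parameters };
  }
  if (tool?.name) {
    return {
      name: tool.name,
      description: tool.description,
      // input_schema takes precedence over parameters, as in the loop above.
      parameters: tool.input_schema ?? tool.parameters,
    };
  }
  return undefined;
}

// Hypothetical helper: render one tool as a markdown block for the prompt.
function renderTool(tool: NormalizedTool): string {
  let block = `### ${tool.name}\n`;
  if (tool.description) block += `${tool.description}\n`;
  if (tool.parameters) block += `Parameters: ${JSON.stringify(tool.parameters)}\n`;
  return block + `\n`;
}

const openAiStyle = {
  type: "function",
  function: { name: "click", description: "Click an element", parameters: { type: "object" } },
};
const alternateStyle = {
  name: "click",
  description: "Click an element",
  input_schema: { type: "object" },
};

const renderedA = renderTool(normalizeTool(openAiStyle)!);
const renderedB = renderTool(normalizeTool(alternateStyle)!);
console.log(renderedA === renderedB);
```

Normalizing up front keeps the prompt text identical for equivalent tools, which may make the model's behavior less dependent on which client schema was used.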

@@ -141,5 +214,6 @@ export async function* execCodexStream(
model: options.model,
tools: options.tools,
tool_choice: options.tool_choice,
browseros_mode: options.browseros_mode,
});
}
21 changes: 19 additions & 2 deletions src/index.ts
@@ -56,8 +56,13 @@ Bun.serve({
const temperature = body.temperature;
const max_tokens = body.max_tokens;
const reasoning_effort = body.reasoning_effort;
const tools = body.tools;
const tool_choice = body.tool_choice;
const tools = Array.isArray(body.tools) ? body.tools : undefined;
// Default to BrowserOS-style strict tool mode whenever tools are supplied,
// unless callers explicitly disable it with browseros_mode: false.
const browseros_mode =
tools && tools.length > 0 ? body.browseros_mode !== false : false;
const tool_choice =
body.tool_choice ?? (browseros_mode ? "required" : undefined);

const stream = body.stream === true;
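
The defaulting rules in this hunk interact in a few ways that are easy to misread, so the sketch below restates them as a pure function and exercises the main cases. `resolveToolMode` and its input and output types are hypothetical; the three expressions inside mirror the lines above.

```typescript
interface ToolModeInput {
  tools?: unknown;
  tool_choice?: unknown;
  browseros_mode?: unknown;
}

interface ToolMode {
  tools: unknown[] | undefined;
  browseros_mode: boolean;
  tool_choice: unknown;
}

// Hypothetical helper mirroring the request-parsing logic in this hunk.
function resolveToolMode(body: ToolModeInput): ToolMode {
  // Non-array values for tools are ignored.
  const tools: unknown[] | undefined = Array.isArray(body.tools) ? body.tools : undefined;
  // Strict mode defaults to on when at least one tool is supplied;
  // only the boolean false turns it off.
  const browseros_mode =
    tools !== undefined && tools.length > 0 ? body.browseros_mode !== false : false;
  // An explicit tool_choice wins; otherwise strict mode implies "required".
  const tool_choice = body.tool_choice ?? (browseros_mode ? "required" : undefined);
  return { tools, browseros_mode, tool_choice };
}

const oneTool = [{ type: "function", function: { name: "click" } }];

const defaulted = resolveToolMode({ tools: oneTool });
const optedOut = resolveToolMode({ tools: oneTool, browseros_mode: false });
const explicitChoice = resolveToolMode({ tools: oneTool, tool_choice: "auto" });
const noTools = resolveToolMode({ tools: [], browseros_mode: true });

console.log(defaulted.browseros_mode, defaulted.tool_choice);
console.log(optedOut.browseros_mode, optedOut.tool_choice);
console.log(explicitChoice.browseros_mode, explicitChoice.tool_choice);
console.log(noTools.browseros_mode, noTools.tool_choice);
```

Two consequences worth noting: `browseros_mode: true` has no effect when the tools array is empty, and a non-boolean value such as the string `"false"` does not disable the mode, because the comparison is against the boolean `false` only.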

@@ -69,6 +74,16 @@ Bun.serve({
if (tools) {
console.log(`[Proxy] Tools count: ${tools.length}`);
}
if (tools && tools.length > 0) {
console.log(
`[Proxy] BrowserOS mode: ${browseros_mode ? "enabled" : "disabled"}`,
);
if (body.browseros_mode === undefined && browseros_mode) {
console.log(
`[Proxy] BrowserOS mode auto-enabled because tools were provided`,
);
}
}

if (stream) {
const responseId = `chatcmpl-${Date.now()}`;
@@ -88,6 +103,7 @@ Bun.serve({
signal: req.signal,
tools,
tool_choice,
browseros_mode,
})) {
if (req.signal.aborted) break;

@@ -269,6 +285,7 @@ Bun.serve({
signal: req.signal,
tools,
tool_choice,
browseros_mode,
})) {
if (req.signal.aborted) break;
if (event.type === "message") {