diff --git a/docs/TODO.md b/docs/TODO.md index a70e0bfbd..097620200 100644 --- a/docs/TODO.md +++ b/docs/TODO.md @@ -1,25 +1,140 @@ # Documentation TODO -## Installation -- [x] Add Homebrew install instructions for Mac/Linux -- [x] Mention that cagent comes pre-installed with Docker Desktop -- [x] Explain why `task build-local` is recommended on Windows (Docker Buildx cross-compilation) +## New Pages -## Quick Start -- [x] Clarify that Step 1 (install binary) and Step 2 (build from source) are two alternative options, not sequential steps -- [x] Add a path to get started from a built-in example (`cagent run examples/...`) or the default agent -- [x] Show running an agent from the registry (`cagent run agentcatalog/...`) +- [x] **Go SDK** — Not documented anywhere. The `examples/golibrary/` directory shows how to use cagent as a Go library. Needs a dedicated page. *(Completed: pages/guides/go-sdk.html)* +- [x] **Hooks** — `hooks` agent config (`pre_tool_use`, `post_tool_use`, `session_start`, `session_end`) is a significant feature with no documentation page. Covers running shell commands at various agent lifecycle points. *(Completed: pages/configuration/hooks.html)* +- [x] **Permissions** — Top-level `permissions` config with `allow`/`deny` glob patterns for tool call approval. Mentioned briefly in TUI page but has no dedicated reference. *(Completed: pages/configuration/permissions.html)* +- [x] **Sandbox Mode** — Shell tool `sandbox` config runs commands in Docker containers. Includes `image` and `paths` (bind mounts with `:ro` support). Not documented. *(Completed: pages/configuration/sandbox.html)* +- [x] **Structured Output** — Agent-level `structured_output` config (name, description, schema, strict). Forces model responses into a JSON schema. Not documented. 
*(Completed: pages/configuration/structured-output.html)* +- [x] **Model Routing** — Model-level `routing` config with rules that route requests to different models based on example phrases (rule-based router). *(Completed: pages/configuration/routing.html)* +- [x] **Custom Providers (top-level `providers` section)** — The `providers` top-level config key for defining reusable provider definitions with `api_type`, `base_url`, `token_key`. *(Completed: Added to pages/configuration/overview.html)* +- [x] **LSP Tool** — `type: lsp` toolset that provides Language Server Protocol integration (diagnostics, code actions, references, rename, etc.). Not documented anywhere. *(Completed: pages/tools/lsp.html)* +- [x] **User Prompt Tool** — `type: user_prompt` toolset that allows agents to ask the user for input mid-conversation. Not documented. *(Completed: pages/tools/user-prompt.html)* +- [x] **API Tool** — `api_config` on toolsets for defining HTTP API tools with endpoint, method, headers, args, and output_schema. Not documented. *(Completed: pages/tools/api.html)* +- [ ] **Branching Sessions** — TUI feature (v1.20.6) allowing editing previous messages to create branches. Mentioned in TUI page but could use more detail. -## Tools -- [x] Add missing built-in tools documentation (think, todo, memory, fetch, script, etc.) -- [x] Expand Tool Config page with more detailed content and examples for each tool type +## Missing Details in Existing Pages -## Features -- [x] Fix broken TUI demo GIF link on the home page +### Configuration > Agents (`agents.html`) -## Core Concepts -- [x] Move Agent Distribution from Features to Core Concepts (packaging & sharing is a fundamental concept) +- [x] `welcome_message` — Agent property not listed in schema or properties table *(Added)* +- [x] `handoffs` — Agent property for listing agents that can be handed off to (different from `sub_agents`). Not documented. 
*(Added)* +- [x] `add_prompt_files` — Agent property for including additional prompt files. *(Added)* +- [x] `add_description_parameter` — Agent property. *(Added)* +- [x] `code_mode_tools` — Agent property. *(Added)* +- [x] `hooks` — Agent property. Not shown in schema or properties table. *(Added with link to hooks page)* +- [x] `structured_output` — Agent property. Not shown in schema or properties table. *(Added with link to structured-output page)* +- [x] `defer` — Tool deferral configuration. *(Added with examples)* +- [x] `permissions` — Permission configuration. *(Added with link to permissions page)* +- [x] `sandbox` — Sandbox mode configuration. *(Added with link to sandbox page)* -## New Pages -- [ ] Remote runtime — remote runtime/server mode is not documented anywhere -- [ ] Changelog / What's New — could add a page or section sourced from `CHANGELOG.md` +### Configuration > Tools (`tools.html`) + +- [x] **LSP toolset** (`type: lsp`) — Language Server Protocol integration. *(Added with link to dedicated page)* +- [x] **User Prompt toolset** (`type: user_prompt`) — User input collection. *(Added with link to dedicated page)* +- [x] **API toolset** (`type: api`) — HTTP API tools. *(Added with link to dedicated page)* +- [x] **Handoff toolset** (`type: handoff`) — A2A agent delegation. *(Added)* +- [x] **A2A toolset** (`type: a2a`) — Toolset for connecting to remote A2A agents with `name` and `url`. *(Added)* +- [x] **Shared todo** (`shared: true`) — Todo toolset option for sharing todos across agents. *(Added)* +- [x] **Filesystem `post_edit`** — Post-edit commands that run after file edits (e.g., auto-format). *(Added)* +- [x] **Filesystem `ignore_vcs`** — Option to ignore VCS (.gitignore) files. *(Added)* +- [x] **Shell `env`** — Environment variables for shell/script/mcp/lsp tools. *(Added)* +- [x] **Fetch `timeout`** — Fetch tool timeout configuration. 
*(Added)* +- [x] **Script tool format** — Updated to show the correct `shell` map format with args, required, env, working_dir. *(Fixed)* +- [x] **MCP `config`** — The `config` field on MCP toolsets. *(Added)* + +### Configuration > Models (`models.html`) + +- [x] `track_usage` — Model property to track token usage. *(Added)* +- [x] `token_key` — Model property for specifying the env var holding the API token. *(Added)* +- [x] `routing` — Model property for rule-based routing. *(Added with link to routing page)* +- [x] `base_url` — Added examples showing how to use it with custom/self-hosted endpoints. *(Added)* + +### Configuration > Overview (`overview.html`) + +- [x] `metadata` — Top-level config section (author, license, description, readme, version). *(Added)* +- [x] `permissions` — Top-level config section. *(Added link to permissions page)* +- [x] `providers` — Top-level section. *(Added full documentation)* +- [x] Config `version` field — Current version is "5" but not documented what it means or how migration works. *(Added)* +- [x] **Advanced configuration cards** — Added cards linking to Hooks, Permissions, Sandbox, and Structured Output pages. *(Done)* + +### Features > CLI (`cli.html`) + +- [x] `--prompt-file` flag — Explanation of how it works (includes file contents as system context). *(Added)* +- [x] `--session` with relative references — e.g., `-1` for last session, `-2` for second to last. *(Added)* +- [x] Multi-turn conversations in `cagent exec` — Added example. *(Added)* +- [x] Queueing multiple messages: `cagent run question1 question2 ...` *(Added)* +- [x] `cagent eval` flags — Added examples with flags. *(Added)* +- [x] `cagent build` command — *(Added)* +- [ ] `--exit-on-stdin-eof` flag — Hidden flag, low priority. +- [ ] `--keep-containers` flag for eval — Already documented in eval page. 
+ +### Features > TUI (`tui.html`) + +- [x] Ctrl+R reverse history search — *(Added with dedicated section)* +- [x] `/title` command for renaming sessions — *(Already documented)* +- [x] `/think` command to toggle thinking at runtime — *(Already documented)* +- [x] Custom themes and hot-reloading — *(Already documented)* +- [x] Ctrl+Z to suspend TUI — *(Added)* +- [x] Ctrl+L audio listening shortcut — *(Added)* +- [x] Ctrl+X to clear queued messages — *(Added)* +- [x] Permissions view dialog — *(Mentioned)* +- [x] Model picker / switching during session — *(Already documented)* +- [ ] Branching sessions (edit previous messages) — Mentioned but could have more detail. +- [ ] Double-click title to edit — Minor feature. + +### Features > Skills (`skills.html`) + +- [x] Skill invocation via slash commands — *(Added)* +- [x] Recursive `~/.agents/skills` directory support — *(Clarified in table)* + +### Features > Evaluation (`evaluation.html`) + +- [x] `--keep-containers` flag — *(Already documented)* +- [x] Session database produced for investigation — *(Added note)* +- [x] Debugging tip for failed evals — *(Added callout)* + +### Providers + +- [x] **Mistral** — Listed as built-in alias but has no dedicated page or usage examples. *(Completed: pages/providers/mistral.html)* +- [x] **xAI (Grok)** — *(Completed: pages/providers/xai.html)* +- [x] **Nebius** — *(Completed: pages/providers/nebius.html)* +- [x] **Ollama** — Can be used via custom providers. *(Completed: pages/providers/local.html - covers Ollama, vLLM, LocalAI)* + +### Features > RAG (`rag.html`) + +- [x] `respect_vcs` option — *(Added with default value)* +- [x] `return_full_content` results option — *(Added)* +- [ ] Code-aware chunking (`code_aware: true`) with tree-sitter — Partially documented, the option is shown in examples. + +### Community > Troubleshooting + +- [x] Common errors: context window exceeded, max iterations reached, model fallback behavior. 
*(Added)* +- [x] Debugging with `--debug` and `--log-file` — *(Already documented)* + +## Tips & Best Practices + +- [x] *(Completed: pages/guides/tips.html)* - Comprehensive tips page covering: + - [x] **Tip: Using `--yolo` mode** — Auto-approve all tool calls. Security implications and when it's appropriate. + - [x] **Tip: Environment variable interpolation in commands** — Commands support `${env.VAR}` and `${env.VAR || 'default'}` JavaScript template syntax. + - [x] **Tip: Fallback model strategy** — Best practices for choosing fallback models. + - [x] **Tip: Deferred tools for performance** — Use `defer: true` to load tools only when needed. + - [x] **Tip: Combining handoffs and sub_agents** — Explain the difference. + - [x] **Tip: Using the `auto` model** — The special `auto` model value for automatic model selection. + - [x] **Tip: Model aliases and pinning** — cagent automatically resolves model aliases to pinned versions. + - [x] **Tip: User-defined default model** — Users can define their own default model in global configuration. *(Added)* + - [x] **Tip: Usage on GitHub** — Example of a PR reviewer agent. 
*(Added)* + +## Navigation Updates + +- [x] Added Model Routing to Configuration section +- [x] Added xAI (Grok) and Nebius to Model Providers section +- [x] Added Guides section with Tips & Best Practices and Go SDK + +## Remaining Low-Priority Items + +- [ ] Branching sessions — More detailed documentation (currently mentioned) +- [ ] Double-click title to edit in TUI — Minor feature +- [ ] `--exit-on-stdin-eof` flag — Hidden flag for integration +- [ ] Code-aware chunking detail — Already shown in examples diff --git a/docs/js/app.js b/docs/js/app.js index c74cf22cb..9e14188ac 100644 --- a/docs/js/app.js +++ b/docs/js/app.js @@ -29,6 +29,19 @@ const NAV = [ { title: 'Agent Config', page: 'configuration/agents' }, { title: 'Model Config', page: 'configuration/models' }, { title: 'Tool Config', page: 'configuration/tools' }, + { title: 'Hooks', page: 'configuration/hooks' }, + { title: 'Permissions', page: 'configuration/permissions' }, + { title: 'Sandbox Mode', page: 'configuration/sandbox' }, + { title: 'Structured Output', page: 'configuration/structured-output' }, + { title: 'Model Routing', page: 'configuration/routing' }, + ], + }, + { + heading: 'Built-in Tools', + items: [ + { title: 'LSP Tool', page: 'tools/lsp' }, + { title: 'User Prompt Tool', page: 'tools/user-prompt' }, + { title: 'API Tool', page: 'tools/api' }, ], }, { @@ -55,9 +68,20 @@ const NAV = [ { title: 'Google Gemini', page: 'providers/google' }, { title: 'AWS Bedrock', page: 'providers/bedrock' }, { title: 'Docker Model Runner', page: 'providers/dmr' }, + { title: 'Mistral', page: 'providers/mistral' }, + { title: 'xAI (Grok)', page: 'providers/xai' }, + { title: 'Nebius', page: 'providers/nebius' }, + { title: 'Local Models', page: 'providers/local' }, { title: 'Custom Providers', page: 'providers/custom' }, ], }, + { + heading: 'Guides', + items: [ + { title: 'Tips & Best Practices', page: 'guides/tips' }, + { title: 'Go SDK', page: 'guides/go-sdk' }, + ], + }, { heading: 'Community', 
items: [ diff --git a/docs/pages/community/troubleshooting.html b/docs/pages/community/troubleshooting.html index 5c89eb10a..ef6e3d51f 100644 --- a/docs/pages/community/troubleshooting.html +++ b/docs/pages/community/troubleshooting.html @@ -1,6 +1,49 @@
Common issues and how to resolve them when working with cagent.
+Error message: context_length_exceeded or similar.
/compact in the TUI to summarize and reduce conversation history
num_history_items in agent config to limit messages sent to the model
The agent hit its max_iterations limit without completing the task.
max_iterations in agent config (default is unlimited, but many agents set 20-50)
Run with --debug to see tool calls
When the primary model fails, cagent automatically switches to fallback models. Look for log messages like "Switching to fallback model".
Configure fallback behavior in your agent config:
+ +agents:
+ root:
+ model: anthropic/claude-sonnet-4-0
+ fallback:
+ models: [openai/gpt-4o, openai/gpt-4o-mini]
+ retries: 2 # retries per model for 5xx errors
+ cooldown: 1m # how long to stick with fallback after 429
+
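The history and iteration fixes described above can be sketched together in agent config (the values are illustrative, not recommendations):

```yaml
agents:
  root:
    model: anthropic/claude-sonnet-4-0
    # Cap the agent loop; the default of 0 means unlimited
    max_iterations: 30
    # Limit how many past messages are sent to the model
    num_history_items: 40
```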
The first step for any issue is enabling debug logging. This provides detailed information about what cagent is doing internally.
diff --git a/docs/pages/configuration/agents.html b/docs/pages/configuration/agents.html index 449703362..11984a7f1 100644 --- a/docs/pages/configuration/agents.html +++ b/docs/pages/configuration/agents.html @@ -8,20 +8,40 @@
| Property | Description |
|---|---|
| add_environment_info | true, injects working directory, OS, CPU architecture, and git info into context. |
| add_prompt_files | |
| add_description_parameter | true, adds agent descriptions as a parameter in tool schemas. Helps with tool selection in multi-agent scenarios. |
| code_mode_tools | true, formats tool responses in a code-optimized format with structured output schemas. Useful for MCP gateway and programmatic access. |
| max_iterations | |
| commands | cagent run config.yaml /command_name. |
| welcome_message | |
| handoffs | |
| defer | true to defer all tools, or provide a list of tool names. |
| hooks | |
| permissions | |
| sandbox | |
| structured_output | |
Default is 0 (unlimited). Always set max_iterations for agents with powerful tools like shell to prevent infinite loops. A value of 20–50 is typical for development agents.
Display a message when users start a session:
+ +agents:
+ assistant:
+ model: openai/gpt-4o
+ description: Development assistant
+ instruction: You are a helpful coding assistant.
+ welcome_message: |
+ 👋 Welcome! I'm your development assistant.
+
+ I can help you with:
+ - Writing and reviewing code
+ - Running tests and debugging
+ - Explaining concepts
+
+ What would you like to work on?
+
+Load tools on-demand to speed up agent startup:
+ +agents:
+ root:
+ model: anthropic/claude-sonnet-4-0
+ description: Multi-purpose assistant
+ instruction: You have access to many tools.
+ toolsets:
+ - type: mcp
+ ref: docker:github-official
+ - type: mcp
+ ref: docker:slack
+ - type: filesystem
+ # Defer all tools - load when first used
+ defer: true
+
+Or defer specific tools:
+ +agents:
+ root:
+ model: openai/gpt-4o
+ defer:
+ - "mcp:github:*" # Defer all GitHub tools
+ - "mcp:slack:*" # Defer all Slack tools
+
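The structured_output property can be sketched as follows. The field names (name, description, schema, strict) come from the config reference; the exact nesting of the JSON Schema body is an assumption:

```yaml
agents:
  root:
    model: openai/gpt-4o
    instruction: Extract the invoice total from the user's message.
    structured_output:
      name: invoice              # name of the output schema
      description: Parsed invoice fields
      strict: true               # enforce the schema strictly
      schema:                    # JSON Schema body (nesting is an assumption)
        type: object
        properties:
          total:
            type: number
        required: [total]
```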
Automatically switch to backup models when the primary fails:
@@ -160,6 +266,7 @@Run shell commands at various points during agent execution for deterministic control over behavior.
+ +Hooks allow you to execute shell commands or scripts at key points in an agent's lifecycle. They provide deterministic control that works alongside the LLM's behavior, enabling validation, logging, environment setup, and more.
+ +There are four hook event types:
+ +| Event | When it fires | Can block? |
|---|---|---|
pre_tool_use | Before a tool call executes | Yes |
post_tool_use | After a tool completes successfully | No |
session_start | When a session begins or resumes | No |
session_end | When a session terminates | No |
agents:
+ root:
+ model: openai/gpt-4o
+ description: An agent with hooks
+ instruction: You are a helpful assistant.
+ hooks:
+ # Run before specific tools
+ pre_tool_use:
+ - matcher: "shell|edit_file"
+ hooks:
+ - type: command
+ command: "./scripts/validate-command.sh"
+ timeout: 30
+
+ # Run after all tool calls
+ post_tool_use:
+ - matcher: "*"
+ hooks:
+ - type: command
+ command: "./scripts/log-tool-call.sh"
+
+ # Run when session starts
+ session_start:
+ - type: command
+ command: "./scripts/setup-env.sh"
+
+ # Run when session ends
+ session_end:
+ - type: command
+ command: "./scripts/cleanup.sh"
+
+The matcher field uses regex patterns to match tool names:
| Pattern | Matches |
|---|---|
* | All tools |
shell | Only the shell tool |
shell|edit_file | Either shell or edit_file |
mcp:.* | All MCP tools (regex) |
Hooks receive JSON input via stdin with context about the event:
+ +{
+ "session_id": "abc123",
+ "cwd": "/path/to/project",
+ "hook_event_name": "pre_tool_use",
+ "tool_name": "shell",
+ "tool_use_id": "call_xyz",
+ "tool_input": {
+ "cmd": "rm -rf /tmp/cache",
+ "cwd": "."
+ }
+}
+
+| Field | pre_tool_use | post_tool_use | session_start | session_end |
|---|---|---|---|---|
session_id | ✓ | ✓ | ✓ | ✓ |
cwd | ✓ | ✓ | ✓ | ✓ |
hook_event_name | ✓ | ✓ | ✓ | ✓ |
tool_name | ✓ | ✓ | ||
tool_use_id | ✓ | ✓ | ||
tool_input | ✓ | ✓ | ||
tool_response | ✓ | |||
source | ✓ | |||
reason | ✓ |
The source field for session_start can be: startup, resume, clear, or compact.
The reason field for session_end can be: clear, logout, prompt_input_exit, or other.
Hooks communicate back via JSON output to stdout:
+ +{
+ "continue": true,
+ "stop_reason": "Optional message when continue=false",
+ "suppress_output": false,
+ "system_message": "Warning message to show user",
+ "decision": "allow",
+ "reason": "Explanation for the decision",
+ "hook_specific_output": {
+ "hook_event_name": "pre_tool_use",
+ "permission_decision": "allow",
+ "permission_decision_reason": "Command is safe",
+ "updated_input": { "cmd": "modified command" }
+ }
+}
+
+| Field | Type | Description |
|---|---|---|
continue | boolean | Whether to continue execution (default: true) |
stop_reason | string | Message to show when continue=false |
suppress_output | boolean | Hide stdout from transcript |
system_message | string | Warning message to display to user |
decision | string | For blocking: block to prevent operation |
reason | string | Explanation for the decision |
The hook_specific_output for pre_tool_use supports:
| Field | Type | Description |
|---|---|---|
permission_decision | string | allow, deny, or ask |
permission_decision_reason | string | Explanation for the decision |
updated_input | object | Modified tool input (replaces original) |
Hook exit codes have special meaning:
+ +| Exit Code | Meaning |
|---|---|
0 | Success — continue normally |
2 | Blocking error — stop the operation |
| Other | Error — logged but execution continues |
A simple pre-tool-use hook that blocks dangerous shell commands:
+ +#!/bin/bash
+# scripts/validate-command.sh
+
+# Read JSON input from stdin
+INPUT=$(cat)
+TOOL_NAME=$(echo "$INPUT" | jq -r '.tool_name')
+CMD=$(echo "$INPUT" | jq -r '.tool_input.cmd // empty')
+
+# Block dangerous commands
+if [[ "$TOOL_NAME" == "shell" ]]; then
+ if [[ "$CMD" =~ ^sudo ]] || [[ "$CMD" =~ rm.*-rf ]]; then
+ echo '{"decision": "block", "reason": "Dangerous command blocked by policy"}'
+ exit 2
+ fi
+fi
+
+# Allow everything else
+echo '{"decision": "allow"}'
+exit 0
+
+A post-tool-use hook that logs all tool calls:
+ +#!/bin/bash
+# scripts/log-tool-call.sh
+
+INPUT=$(cat)
+TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
+TOOL_NAME=$(echo "$INPUT" | jq -r '.tool_name')
+SESSION_ID=$(echo "$INPUT" | jq -r '.session_id')
+
+# Append to audit log
+echo "$TIMESTAMP | $SESSION_ID | $TOOL_NAME" >> ./audit.log
+
+# Don't block execution
+echo '{"continue": true}'
+exit 0
+
+Hooks have a default timeout of 60 seconds. You can customize this per hook:
+ +hooks:
+ pre_tool_use:
+ - matcher: "*"
+ hooks:
+ - type: command
+ command: "./slow-validation.sh"
+ timeout: 120 # 2 minutes
+
+Hooks run synchronously and can slow down agent execution. Keep hook scripts fast and efficient. Consider using suppress_output: true for logging hooks to reduce noise.
top_p
frequency_penalty
presence_penalty
base_url
token_key
thinking_budget
parallel_tool_calls
track_usage
routing
provider_opts
For detailed provider setup, see the Model Providers section.
+ +Use base_url to point to custom or self-hosted endpoints:
models:
+ # Azure OpenAI
+ azure_gpt:
+ provider: openai
+ model: gpt-4o
+ base_url: https://my-resource.openai.azure.com/openai/deployments/gpt-4o
+ token_key: AZURE_OPENAI_API_KEY
+
+ # Self-hosted vLLM
+ local_llama:
+ provider: openai # vLLM is OpenAI-compatible
+ model: meta-llama/Llama-3.2-3B-Instruct
+ base_url: http://localhost:8000/v1
+
+ # Proxy or gateway
+ proxied:
+ provider: openai
+ model: gpt-4o
+ base_url: https://proxy.internal.company.com/openai/v1
+ token_key: INTERNAL_API_KEY
+
+See Local Models for more examples of custom endpoints.
diff --git a/docs/pages/configuration/overview.html b/docs/pages/configuration/overview.html index 8263fc310..fb7607d77 100644 --- a/docs/pages/configuration/overview.html +++ b/docs/pages/configuration/overview.html @@ -3,16 +3,25 @@A cagent YAML config has three main sections:
+A cagent YAML config has these main sections:
-# 1. Models — define AI models with their parameters
+# 1. Version — configuration schema version (optional but recommended)
+version: 5
+
+# 2. Metadata — optional agent metadata for distribution
+metadata:
+ author: my-org
+ description: My helpful agent
+ version: "1.0.0"
+
+# 3. Models — define AI models with their parameters
models:
claude:
provider: anthropic
model: claude-sonnet-4-0
max_tokens: 64000
-# 2. Agents — define AI agents with their behavior
+# 4. Agents — define AI agents with their behavior
agents:
root:
model: claude
@@ -21,12 +30,25 @@ File Structure
toolsets:
- type: think
-# 3. Providers — optional custom provider definitions
+# 5. RAG — define retrieval-augmented generation sources (optional)
+rag:
+ docs:
+ docs: ["./docs"]
+ strategies:
+ - type: chunked-embeddings
+ model: openai/text-embedding-3-small
+
+# 6. Providers — optional custom provider definitions
providers:
my_provider:
api_type: openai_chatcompletions
base_url: https://api.example.com/v1
- token_key: MY_API_KEY
+ token_key: MY_API_KEY
+
+# 7. Permissions — global tool permission rules (optional)
+permissions:
+ allow: ["read_*"]
+ deny: ["shell:cmd=sudo*"]
All agent properties: model, instruction, tools, sub-agents, and more.
+All agent properties: model, instruction, tools, sub-agents, permissions, hooks, and more.
@@ -71,7 +93,32 @@Built-in tools, MCP tools, Docker MCP, and tool filtering.
+Built-in tools, MCP tools, Docker MCP, LSP, API tools, and tool filtering.
+ + + +Run shell commands at lifecycle events like tool calls and session start/end.
+ + + +Control which tools auto-approve, require confirmation, or are blocked.
+ + + +Run shell commands in an isolated Docker container for security.
+ + + +Constrain agent responses to match a specific JSON schema.
For editor autocompletion and validation, use the cagent JSON Schema. Add this to the top of your YAML file:
# yaml-language-server: $schema=https://raw.githubusercontent.com/docker/cagent/main/cagent-schema.json
+
+cagent configs are versioned. The current version is 5. Add the version at the top of your config:
version: 5
+
+agents:
+ root:
+ model: openai/gpt-4o
+ # ...
+
+When you load an older config, cagent automatically migrates it to the latest schema. It's recommended to include the version to ensure consistent behavior.
+ +Optional metadata for agent distribution via OCI registries:
+ +metadata:
+ author: my-org
+ license: Apache-2.0
+ description: A helpful coding assistant
+ readme: | # Displayed in registries
+ This agent helps with coding tasks.
+ version: "1.0.0"
+
+| Field | Description |
|---|---|
author | Author or organization name |
license | License identifier (e.g., Apache-2.0, MIT) |
description | Short description for the agent |
readme | Longer markdown description |
version | Semantic version string |
See Agent Distribution for publishing agents to registries.
+ +Define reusable provider configurations for custom or self-hosted endpoints:
+ +providers:
+ azure:
+ api_type: openai_chatcompletions
+ base_url: https://my-resource.openai.azure.com/openai/deployments/gpt-4o
+ token_key: AZURE_OPENAI_API_KEY
+
+ internal_llm:
+ api_type: openai_chatcompletions
+ base_url: https://llm.internal.company.com/v1
+ token_key: INTERNAL_API_KEY
+
+models:
+ azure_gpt:
+ provider: azure # References the custom provider
+ model: gpt-4o
+
+agents:
+ root:
+ model: azure_gpt
+
+| Field | Description |
|---|---|
api_type | API schema: openai_chatcompletions (default) or openai_responses |
base_url | Base URL for the API endpoint |
token_key | Environment variable name for the API token |
See Custom Providers for more details.
diff --git a/docs/pages/configuration/permissions.html b/docs/pages/configuration/permissions.html new file mode 100644 index 000000000..3715dd001 --- /dev/null +++ b/docs/pages/configuration/permissions.html @@ -0,0 +1,181 @@ +Control which tools can execute automatically, require confirmation, or are blocked entirely.
+ +Permissions provide fine-grained control over tool execution. You can configure which tools are auto-approved (run without asking), which require user confirmation, and which are completely blocked.
+ +Permissions are evaluated in this order: Deny → Allow → Ask. Deny patterns take priority, then allow patterns, and anything else defaults to asking for user confirmation.
+agents:
+ root:
+ model: openai/gpt-4o
+ description: Agent with permission controls
+ instruction: You are a helpful assistant.
+ permissions:
+ # Auto-approve these tools (no confirmation needed)
+ allow:
+ - "read_file"
+ - "read_*" # Glob patterns
+ - "shell:cmd=ls*" # With argument matching
+
+ # Block these tools entirely
+ deny:
+ - "shell:cmd=sudo*"
+ - "shell:cmd=rm*-rf*"
+ - "dangerous_tool"
+
+Permissions support glob-style patterns with optional argument matching:
+ +| Pattern | Matches |
|---|---|
shell | Exact match for shell tool |
read_* | Any tool starting with read_ |
mcp:github:* | Any GitHub MCP tool |
* | All tools |
You can match tools based on their argument values using tool:arg=pattern syntax:
permissions:
+ allow:
+ # Allow shell only when cmd starts with "ls" or "cat"
+ - "shell:cmd=ls*"
+ - "shell:cmd=cat*"
+
+ # Allow edit_file only in specific directory
+ - "edit_file:path=/home/user/safe/*"
+
+ deny:
+ # Block shell with sudo
+ - "shell:cmd=sudo*"
+
+ # Block writes to system directories
+ - "write_file:path=/etc/*"
+ - "write_file:path=/usr/*"
+
+Chain multiple argument conditions with colons. All conditions must match:
+ +permissions:
+ allow:
+ # Allow shell with ls in current directory
+ - "shell:cmd=ls*:cwd=."
+
+ deny:
+ # Block shell with rm -rf anywhere
+ - "shell:cmd=rm*:cmd=*-rf*"
+
+Patterns follow filepath.Match semantics with some extensions:
+ +* — matches any sequence of characters (including spaces)? — matches any single character[abc] — matches any character in the set[a-z] — matches any character in the rangeMatching is case-insensitive.
+ +Trailing wildcards like sudo* match any characters including spaces, so sudo* matches sudo rm -rf /.
| Decision | Behavior |
|---|---|
| Allow | Tool executes immediately without user confirmation |
| Ask | User must confirm before tool executes (default) |
| Deny | Tool is blocked and returns an error to the agent |
Allow all read operations, block all writes:
+ +permissions:
+ allow:
+ - "read_file"
+ - "read_multiple_files"
+ - "list_directory"
+ - "directory_tree"
+ - "search_files_content"
+ deny:
+ - "write_file"
+ - "edit_file"
+ - "shell"
+
+Allow specific safe commands, block dangerous ones:
+ +permissions:
+ allow:
+ - "shell:cmd=ls*"
+ - "shell:cmd=cat*"
+ - "shell:cmd=grep*"
+ - "shell:cmd=find*"
+ - "shell:cmd=head*"
+ - "shell:cmd=tail*"
+ - "shell:cmd=wc*"
+ deny:
+ - "shell:cmd=sudo*"
+ - "shell:cmd=rm*"
+ - "shell:cmd=mv*"
+ - "shell:cmd=chmod*"
+ - "shell:cmd=chown*"
+
+Control MCP tools by their qualified names:
+ +permissions:
+ allow:
+ # Allow all GitHub read operations
+ - "mcp:github:get_*"
+ - "mcp:github:list_*"
+ - "mcp:github:search_*"
+ deny:
+ # Block destructive GitHub operations
+ - "mcp:github:delete_*"
+ - "mcp:github:close_*"
+
+Permissions work alongside hooks. The evaluation order is:
+ +Hooks can override allow decisions but cannot override deny decisions.
+ +Permissions are enforced client-side. They help prevent accidental operations but should not be relied upon as a security boundary for untrusted agents. For stronger isolation, use sandbox mode.
+Route requests to different models based on the content of user messages.
+ +Model routing lets you define a "router" model that automatically selects the best underlying model based on the user's message. This is useful for cost optimization, specialized handling, or load balancing across models.
+ +cagent uses NLP-based text similarity (via Bleve full-text search) to match user messages against example phrases you define. The route with the best-matching examples wins, and that model handles the request.
+Add routing rules to any model definition. The model's provider/model fields become the fallback when no route matches:
models:
+ smart_router:
+ # Fallback model when no routing rule matches
+ provider: openai
+ model: gpt-4o-mini
+
+ # Routing rules
+ routing:
+ - model: anthropic/claude-sonnet-4-0
+ examples:
+ - "Write a detailed technical document"
+ - "Help me architect this system"
+ - "Review this code for security issues"
+ - "Explain this complex algorithm"
+
+ - model: openai/gpt-4o
+ examples:
+ - "Generate some creative ideas"
+ - "Write a story about"
+ - "Help me brainstorm"
+ - "Come up with names for"
+
+ - model: openai/gpt-4o-mini
+ examples:
+ - "What time is it"
+ - "Convert this to JSON"
+ - "Simple math calculation"
+ - "Translate this word"
+
+agents:
+ root:
+ model: smart_router
+ description: Assistant with intelligent model routing
+ instruction: You are a helpful assistant.
+
+Each routing rule has:
+ +| Field | Type | Required | Description |
|---|---|---|---|
model | string | ✓ | Target model (inline format or reference to models section) |
examples | array | ✓ | Example phrases that should route to this model |
The router:
+ +Route simple queries to cheaper models:
+ +models:
+ cost_optimizer:
+ provider: openai
+ model: gpt-4o-mini # Cheap fallback
+ routing:
+ - model: anthropic/claude-sonnet-4-0
+ examples:
+ - "Complex analysis"
+ - "Detailed research"
+ - "Multi-step reasoning"
+
+Route coding tasks to code-specialized models:
+ +models:
+ task_router:
+ provider: openai
+ model: gpt-4o # General fallback
+ routing:
+ - model: anthropic/claude-sonnet-4-0
+ examples:
+ - "Write code"
+ - "Debug this function"
+ - "Review my implementation"
+ - "Fix this bug"
+ - model: openai/gpt-4o
+ examples:
+ - "Write a blog post"
+ - "Help me with writing"
+ - "Summarize this document"
+
+Distribute load across equivalent models from different providers:
+ +models:
+ load_balancer:
+ provider: openai
+ model: gpt-4o
+ routing:
+ - model: anthropic/claude-sonnet-4-0
+ examples:
+ - "First request pattern"
+ - "Another request type"
+ - model: google/gemini-2.5-flash
+ examples:
+ - "Different request pattern"
+ - "Alternative query style"
+
+Enable debug logging to see routing decisions:
+ +$ cagent run config.yaml --debug
+
+Look for log entries like:
+ +"Rule-based router selected model" router=smart_router selected_model=anthropic/claude-sonnet-4-0
+"Route matched" model=anthropic/claude-sonnet-4-0 score=2.45
+
+Run shell commands in an isolated Docker container for enhanced security.
+ +Sandbox mode runs shell tool commands inside a Docker container instead of directly on the host system. This provides an additional layer of isolation, limiting the potential impact of unintended or malicious commands.
+ +Sandbox mode requires Docker to be installed and running on the host system.
+agents:
+ root:
+ model: openai/gpt-4o
+ description: Agent with sandboxed shell
+ instruction: You are a helpful assistant.
+ toolsets:
+ - type: shell
+ sandbox:
+ image: alpine:latest # Docker image to use
+ paths: # Directories to mount
+ - "." # Current directory (read-write)
+ - "/data:ro" # Read-only mount
+
+| Property | Type | Default | Description |
|---|---|---|---|
| image | string | alpine:latest | Docker image to use for the sandbox container |
| paths | array | [] | Host paths to mount into the container |
Paths can be specified with optional access modes:
+
+| Format | Description |
|---|---|
| /path | Mount with read-write access (default) |
| /path:rw | Explicitly read-write |
| /path:ro | Read-only mount |
| . | Current working directory |
| ./relative | Relative path (resolved from working directory) |
Paths are mounted at the same location inside the container as on the host, so file paths in commands work the same way.
+ +agents:
+ developer:
+ model: anthropic/claude-sonnet-4-0
+ description: Development agent with sandboxed shell
+ instruction: |
+ You are a software developer. Use the shell tool to run
+ build commands and tests. Your shell runs in a sandbox.
+ toolsets:
+ - type: shell
+ - type: filesystem
+ sandbox:
+ image: node:20-alpine # Node.js environment
+ paths:
+ - "." # Project directory
+ - "/tmp:rw" # Temp directory for builds
+
+Sandbox containers are started with these Docker options:
+
+- --rm — Automatically remove when stopped
+- --init — Use init process for proper signal handling
+- --network host — Share host network (commands can access network)
+
+If cagent crashes or is killed, sandbox containers may be left running. cagent automatically cleans up orphaned containers from previous runs when it starts. Containers are identified by labels and the PID of the cagent process that created them.
+ +Select a Docker image that has the tools your agent needs:
+ +| Use Case | Suggested Image |
|---|---|
| General scripting | alpine:latest |
| Node.js development | node:20-alpine |
| Python development | python:3.12-alpine |
| Go development | golang:1.23-alpine |
| Full Linux environment | ubuntu:24.04 |
For complex setups, build a custom Docker image with all required tools pre-installed. This avoids installation time during agent execution.
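
For instance, a custom sandbox image for a Go project might start from one of the suggested bases and pre-install tooling (package choices illustrative):

```dockerfile
# Custom sandbox image with project tooling pre-installed
FROM golang:1.23-alpine

# CLI tools the agent is expected to call from the shell
RUN apk add --no-cache git make curl

# Pre-install linters so agent runs don't pay install time
RUN go install honnef.co/go/tools/cmd/staticcheck@latest
```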
+Only the shell tool runs in the sandbox; other tools (filesystem, MCP) run on the host.
+
+For defense in depth, combine sandbox mode with permissions:
+ +agents:
+ root:
+ model: openai/gpt-4o
+ description: Secure development agent
+ instruction: You are a helpful assistant.
+ toolsets:
+ - type: shell
+ - type: filesystem
+ sandbox:
+ image: node:20-alpine
+ paths:
+ - ".:rw"
+ permissions:
+ allow:
+ - "shell:cmd=npm*"
+ - "shell:cmd=node*"
+ - "shell:cmd=ls*"
+ deny:
+ - "shell:cmd=sudo*"
+ - "shell:cmd=curl*"
+ - "shell:cmd=wget*"
+
diff --git a/docs/pages/configuration/structured-output.html b/docs/pages/configuration/structured-output.html
new file mode 100644
index 000000000..0445ed56a
--- /dev/null
+++ b/docs/pages/configuration/structured-output.html
@@ -0,0 +1,213 @@
+Force the agent to respond with JSON matching a specific schema.
+ +Structured output constrains the agent's responses to match a predefined JSON schema. This is useful for building agents that need to produce machine-readable output for downstream processing, API responses, or integration with other systems.
+ +agents:
+ analyzer:
+ model: openai/gpt-4o
+ description: Code analyzer that outputs structured results
+ instruction: |
+ Analyze the provided code and identify issues.
+ Return your findings in the structured format.
+ structured_output:
+ name: analysis_result
+ description: Code analysis findings
+ strict: true
+ schema:
+ type: object
+ properties:
+ issues:
+ type: array
+ items:
+ type: object
+ properties:
+ severity:
+ type: string
+ enum: ["error", "warning", "info"]
+ line:
+ type: integer
+ message:
+ type: string
+ required: ["severity", "line", "message"]
+ summary:
+ type: string
+ required: ["issues", "summary"]
+
+| Property | Type | Required | Description |
|---|---|---|---|
| name | string | ✓ | Name identifier for the output schema |
| description | string | ✗ | Description of what the output represents |
| strict | boolean | ✗ | Enforce strict schema validation (default: false) |
| schema | object | ✓ | JSON Schema defining the output structure |
The schema follows JSON Schema specification. Common schema types:
+ +schema:
+ type: object
+ properties:
+ name:
+ type: string
+ count:
+ type: integer
+ active:
+ type: boolean
+ required: ["name", "count"]
+
+schema:
+ type: object
+ properties:
+ items:
+ type: array
+ items:
+ type: object
+ properties:
+ id:
+ type: string
+ value:
+ type: number
+ required: ["id", "value"]
+ required: ["items"]
+
+schema:
+ type: object
+ properties:
+ status:
+ type: string
+ enum: ["pending", "approved", "rejected"]
+ priority:
+ type: string
+ enum: ["low", "medium", "high", "critical"]
+ required: ["status"]
+
+The strict setting controls how tightly the output is constrained:
+
+- strict: false (default) — the model aims to match the schema but may include additional fields or slight variations.
+- strict: true — the model is constrained to only produce output that exactly matches the schema. This provides stronger guarantees but may limit the model's flexibility.
+Structured output support varies by provider:
+ +| Provider | Support | Notes |
|---|---|---|
| OpenAI | ✓ Full | Native JSON mode with schema validation |
| Anthropic | ✓ Full | Tool-based structured output |
| Google Gemini | ✓ Full | Native JSON mode |
| AWS Bedrock | ✓ Partial | Depends on underlying model |
| DMR | ⚠️ Limited | Depends on model capabilities |
agents:
+ extractor:
+ model: openai/gpt-4o
+ description: Extract structured data from text
+ instruction: |
+ Extract contact information from the provided text.
+ Return all found contacts in the structured format.
+ structured_output:
+ name: contacts
+ description: Extracted contact information
+ strict: true
+ schema:
+ type: object
+ properties:
+ contacts:
+ type: array
+ items:
+ type: object
+ properties:
+ name:
+ type: string
+ description: Full name of the contact
+ email:
+ type: string
+ description: Email address
+ phone:
+ type: string
+ description: Phone number
+ company:
+ type: string
+ description: Company or organization
+ required: ["name"]
+ total_found:
+ type: integer
+ description: Total number of contacts found
+ required: ["contacts", "total_found"]
+
+agents:
+ classifier:
+ model: anthropic/claude-sonnet-4-0
+ description: Classify support tickets
+ instruction: |
+ Classify the support ticket into the appropriate category
+ and priority level based on its content.
+ structured_output:
+ name: ticket_classification
+ strict: true
+ schema:
+ type: object
+ properties:
+ category:
+ type: string
+ enum: ["billing", "technical", "account", "feature_request", "other"]
+ priority:
+ type: string
+ enum: ["low", "medium", "high", "urgent"]
+ confidence:
+ type: number
+ minimum: 0
+ maximum: 1
+ description: Confidence score between 0 and 1
+ reasoning:
+ type: string
+ description: Brief explanation for the classification
+ required: ["category", "priority", "confidence"]
+
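+
Downstream code can consume the constrained response by unmarshalling it into a matching type. A minimal Go sketch (the struct mirrors the ticket_classification schema above; the sample payload is illustrative):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TicketClassification mirrors the ticket_classification schema.
type TicketClassification struct {
	Category   string  `json:"category"`
	Priority   string  `json:"priority"`
	Confidence float64 `json:"confidence"`
	Reasoning  string  `json:"reasoning,omitempty"`
}

// parseClassification decodes a structured-output response.
func parseClassification(raw string) (TicketClassification, error) {
	var tc TicketClassification
	err := json.Unmarshal([]byte(raw), &tc)
	return tc, err
}

func main() {
	raw := `{"category":"billing","priority":"high","confidence":0.92,"reasoning":"Mentions an incorrect charge"}`
	tc, err := parseClassification(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s/%s (%.2f)\n", tc.Category, tc.Priority, tc.Confidence)
}
```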
+When using structured output, the agent typically cannot use tools since its response format is constrained to the schema. Design your agent workflow accordingly — structured output agents work best for single-turn analysis or extraction tasks.
+Read, write, list, search, and navigate files in the working directory.
toolsets:
- - type: filesystem
+ - type: filesystem
+ ignore_vcs: false # Optional: ignore .gitignore files
+ post_edit: # Optional: run commands after file edits
+ - path: "*.go"
+ cmd: "gofmt -w ${file}"
| Operation | Description |
|---|---|
| Property | Type | Default | Description |
|---|---|---|---|
| ignore_vcs | boolean | false | When true, ignores .gitignore patterns and includes all files |
| post_edit | array | [] | Commands to run after editing files matching a path pattern |
| post_edit[].path | string | — | Glob pattern for files (e.g., *.go, src/**/*.ts) |
| post_edit[].cmd | string | — | Command to run (use ${file} for the edited file path) |
The filesystem tool resolves paths relative to the working directory. Agents can also use absolute paths.
@@ -31,10 +45,21 @@
Execute arbitrary shell commands. Each call runs in a fresh, isolated shell session — no state persists between calls.
toolsets:
- - type: shell
+ - type: shell
+ env: # Optional: environment variables
+ MY_VAR: "value"
+ PATH: "${PATH}:/custom/bin"
The agent has access to the full system shell and environment variables. Commands have a default 30-second timeout. Requires user confirmation unless --yolo is used.
| Property | Type | Description |
|---|---|---|
| env | object | Environment variables to set for all shell commands |
| sandbox | object | Run commands in a Docker container. See Sandbox Mode. |
Step-by-step reasoning scratchpad. The agent writes its thoughts without producing visible output — ideal for planning, decomposition, and decision-making.
toolsets:
@@ -45,7 +70,8 @@ Think
Todo
Task list management. Agents can create, update, and track tasks with status (pending, in-progress, completed).
toolsets:
- - type: todo
+ - type: todo
+ shared: false # Optional: share todos across agents
| Operation | Description |
|---|---|
| Property | Type | Default | Description |
|---|---|---|---|
| shared | boolean | false | When true, todos are shared across all agents in a multi-agent config |
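
For example, a coordinator and its sub-agent can work from one todo list by enabling shared on both (sketch; agent names illustrative):

```yaml
agents:
  root:
    model: openai/gpt-4o
    description: Coordinator
    instruction: Plan the work as todos, then delegate.
    sub_agents: [worker]
    toolsets:
      - type: todo
        shared: true

  worker:
    model: openai/gpt-4o-mini
    description: Worker agent
    instruction: Complete todos created by the coordinator.
    toolsets:
      - type: todo
        shared: true
```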
Persistent key-value storage backed by SQLite. Data survives across sessions, letting agents remember context, user preferences, and past decisions.
toolsets:
@@ -73,37 +106,152 @@ Memory
Fetch
Make HTTP requests to external APIs and web services.
toolsets:
- - type: fetch
+ - type: fetch
+ timeout: 30 # Optional: request timeout in seconds
Supports GET, POST, PUT, DELETE, and other HTTP methods. The agent can set headers, send request bodies, and receive response data. Useful for calling REST APIs, reading web pages, and downloading content.
+| Property | Type | Default | Description |
|---|---|---|---|
| timeout | int | 30 | Request timeout in seconds |
Define custom shell scripts as named tools. Unlike the generic shell tool, scripts are predefined and can be given descriptive names — ideal for exposing safe, well-scoped operations.
Simple format:
toolsets:
- type: script
- scripts:
- - name: run_tests
+ shell:
+ run_tests:
+ cmd: task test
description: Run the project test suite
- command: task test
- - name: lint
+ lint:
+ cmd: task lint
description: Run the linter
- command: task lint
- - name: deploy_staging
- description: Deploy to the staging environment
- command: ./scripts/deploy.sh staging
+ deploy:
+ cmd: ./scripts/deploy.sh ${env}
+ description: Deploy to an environment
+ args:
+ env:
+ type: string
+ enum: [staging, production]
+ required: [env]
| Property | Type | Description |
|---|---|---|
| scripts[].name | string | Tool name the agent sees and calls |
| scripts[].description | string | Description shown to the model for tool selection |
| scripts[].command | string | Shell command to execute when the tool is called |
| shell.<name>.cmd | string | Shell command to execute (supports ${arg} interpolation) |
| shell.<name>.description | string | Description shown to the model |
| shell.<name>.args | object | Parameter definitions (JSON Schema properties) |
| shell.<name>.required | array | Required parameter names |
| shell.<name>.env | object | Environment variables for this script |
| shell.<name>.working_dir | string | Working directory for script execution |
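
The per-script env and working_dir fields scope a script's execution environment; a sketch (script name and values illustrative):

```yaml
toolsets:
  - type: script
    shell:
      build_docs:
        cmd: make html
        description: Build the documentation site
        working_dir: ./docs
        env:
          SPHINXOPTS: "-W"
```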
The transfer_task tool is automatically available when an agent has sub_agents. Allows delegating tasks to sub-agents. No configuration needed — it's enabled implicitly.
Connect to language servers for code intelligence: go-to-definition, find references, diagnostics, and more.
+toolsets:
+ - type: lsp
+ command: gopls
+ args: []
+ file_types: [".go"]
+
+
+| Property | Type | Description |
|---|---|---|
| command | string | LSP server executable command |
| args | array | Command-line arguments for the LSP server |
| env | object | Environment variables for the LSP process |
| file_types | array | File extensions this LSP handles |
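
Other language servers follow the same pattern; for example, a TypeScript setup might look like this (assumes typescript-language-server is installed on the host):

```yaml
toolsets:
  - type: lsp
    command: typescript-language-server
    args: ["--stdio"]
    file_types: [".ts", ".tsx"]
```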
See LSP Tool for full documentation.
+ +Ask users questions and collect interactive input during agent execution.
+toolsets:
+ - type: user_prompt
+
+The agent can use this tool to ask questions, present choices, or collect information from the user. Supports JSON Schema for structured input validation.
+ +See User Prompt Tool for full documentation.
+ +Create custom tools that call HTTP APIs without writing code.
+toolsets:
+ - type: api
+ name: get_weather
+ method: GET
+ endpoint: "https://api.weather.example/v1/current?city=${city}"
+ instruction: Get current weather for a city
+ args:
+ city:
+ type: string
+ description: City name
+ required: ["city"]
+ headers:
+ Authorization: "Bearer ${env.WEATHER_API_KEY}"
+
+
+| Property | Type | Description |
|---|---|---|
| name | string | Tool name |
| method | string | HTTP method: GET or POST |
| endpoint | string | URL with ${param} interpolation |
| args | object | Parameter definitions |
| required | array | Required parameter names |
| headers | object | HTTP headers (supports ${env.VAR}) |
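
A POST variant uses the same properties (endpoint, field names, and env var are hypothetical):

```yaml
toolsets:
  - type: api
    name: create_ticket
    method: POST
    endpoint: "https://api.example.com/v1/tickets"
    instruction: Create a support ticket
    args:
      title:
        type: string
        description: Ticket title
      body:
        type: string
        description: Ticket description
    required: ["title"]
    headers:
      Authorization: "Bearer ${env.TICKET_API_KEY}"
```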
See API Tool for full documentation.
+ +Delegate tasks to remote agents via the A2A (Agent-to-Agent) protocol.
+toolsets:
+ - type: handoff
+ name: research_agent
+ description: Specialized research agent
+ url: "http://localhost:8080/a2a"
+ timeout: 5m
+
+
+| Property | Type | Description |
|---|---|---|
| name | string | Tool name for delegation |
| description | string | Description for the agent |
| url | string | A2A server endpoint URL |
| timeout | string | Request timeout (default: 5m) |
See A2A Protocol for full documentation.
+ +Connect to remote agents via the A2A protocol. Similar to handoff but configured as a toolset.
+toolsets:
+ - type: a2a
+ name: research_agent
+ url: "http://localhost:8080/a2a"
+
+
+| Property | Type | Description |
|---|---|---|
| name | string | Tool name for the remote agent |
| url | string | A2A server endpoint URL |
See A2A Protocol for full documentation.
+Extend agents with external tools via the Model Context Protocol.
@@ -128,6 +276,7 @@
cagent run

Flags:

- -a, --agent <name>
- --yolo
- --model <ref> — provider/model for all agents, or agent=provider/model for specific agents. Comma-separate multiple overrides.
- --session <id> — resume a session (-1 = last, -2 = second to last)
- --prompt-file <path>
- -c <name>
- -d, --debug
- --log-file <path>

cagent exec

Run an agent in non-interactive (headless) mode. No TUI — output goes to stdout.
@@ -116,7 +120,21 @@ cagent push / cagent pull
cagent eval

Run agent evaluations.
-$ cagent eval eval-config.yaml
+$ cagent eval eval-config.yaml
+
+# With flags
+$ cagent eval agent.yaml ./evals -c 8 # 8 concurrent evaluations
+$ cagent eval agent.yaml --keep-containers # Keep containers for debugging
+$ cagent eval agent.yaml --only "auth*" # Only run matching evals
+
+cagent build
+
+Build agent configuration into a distributable format.
+ +$ cagent build [config] [flags]
+
+# Examples
+$ cagent build agent.yaml
+$ cagent build agent.yaml -o ./dist
cagent alias

Manage agent aliases for quick access.
diff --git a/docs/pages/features/evaluation.html b/docs/pages/features/evaluation.html
index 0bb6985b8..bca8a52e6 100644
--- a/docs/pages/features/evaluation.html
+++ b/docs/pages/features/evaluation.html
@@ -195,10 +195,16 @@
Use --keep-containers to preserve containers after evaluation. You can then inspect them with docker exec to understand why an eval failed. The session database (.db file) contains the full conversation history for each eval.
$ cagent eval demo.yaml ./evals
✓ Counting Files in Local Folder
diff --git a/docs/pages/features/rag.html b/docs/pages/features/rag.html
index 2b7cbc4ae..378d5aa8a 100644
--- a/docs/pages/features/rag.html
+++ b/docs/pages/features/rag.html
@@ -168,13 +168,13 @@ Configuration Reference
Top-Level RAG Fields
- | Field | Type | Description |
+ | Field | Type | Default | Description |
- | docs | []string | Document paths/directories (shared across strategies) |
- | description | string | Human-readable description of this RAG source |
- | respect_vcs | boolean | Respect .gitignore files (default: true) |
- | strategies | []object | Array of retrieval strategy configurations |
- | results | object | Post-processing: fusion, reranking, deduplication, final limit |
+ | docs | []string | — | Document paths/directories (shared across strategies) |
+ | description | string | — | Human-readable description of this RAG source |
+ | respect_vcs | boolean | true | Respect .gitignore files when indexing documents |
+ | strategies | []object | — | Array of retrieval strategy configurations |
+ | results | object | — | Post-processing: fusion, reranking, deduplication, final limit |
@@ -239,8 +239,10 @@ Results (Post-Processing)
| fusion.strategy | string | rrf | Fusion method: rrf, weighted, or max |
| fusion.k | int | 60 | RRF rank constant |
- | deduplicate | bool | false | Remove duplicate results |
- | limit | int | 5 | Final number of results |
+ | deduplicate | bool | true | Remove duplicate results |
+ | limit | int | 15 | Final number of results |
+ | include_score | bool | false | Include relevance scores in results |
+ | return_full_content | bool | false | Return full document content instead of just matched chunks |
| reranking.model | string | — | Reranking model reference |
| reranking.top_k | int | (all) | Only rerank top K results |
| reranking.threshold | float | 0.5 | Minimum relevance score after reranking |
diff --git a/docs/pages/features/skills.html b/docs/pages/features/skills.html
index f8204bd54..d2b39fce4 100644
--- a/docs/pages/features/skills.html
+++ b/docs/pages/features/skills.html
@@ -53,9 +53,9 @@ Global
| Path | Search Type |
- | ~/.codex/skills/ | Recursive |
- | ~/.claude/skills/ | Flat |
- | ~/.agents/skills/ | Recursive |
+ | ~/.codex/skills/ | Recursive (searches all subdirectories) |
+ | ~/.claude/skills/ | Flat (immediate children only) |
+ | ~/.agents/skills/ | Recursive (searches all subdirectories) |
@@ -68,6 +68,22 @@ Project (from git root to current directory)
+Invoking Skills
+
+Skills can be invoked in multiple ways:
+
+
+ - Automatic: The agent detects when your request matches a skill's description and loads it automatically
+ - Explicit: Reference the skill name in your prompt: "Use the create-dockerfile skill to..."
+ - Slash command: Use /{skill-name} to invoke a skill directly
+
+
+# In the TUI, invoke skill directly:
+/create-dockerfile
+
+# Or mention it in your message:
+"Create a dockerfile for my Python app (use the create-dockerfile skill)"
+
Precedence
When multiple skills share the same name:
diff --git a/docs/pages/features/tui.html b/docs/pages/features/tui.html
index ac234cc2a..efcc62063 100644
--- a/docs/pages/features/tui.html
+++ b/docs/pages/features/tui.html
@@ -120,13 +120,20 @@ Keyboard Shortcuts
Ctrl+K Open command palette
Ctrl+M Switch model
- Ctrl+R Reverse history search
- Ctrl+L Audio listening mode
- Ctrl+Z Suspend to background
+ Ctrl+R Reverse history search (search previous inputs)
+ Ctrl+L Start audio listening mode (voice input)
+ Ctrl+Z Suspend TUI to background (resume with fg)
+ Ctrl+X Clear queued messages
Escape Cancel current operation
+ Enter Send message (or newline with Shift+Enter)
+ Up/Down Navigate message history
+History Search
+
+Press Ctrl+R to enter incremental history search mode. Start typing to filter through your previous inputs. Press Enter to select a match, or Escape to cancel.
+
Theming
Customize the TUI appearance with built-in or custom themes:
diff --git a/docs/pages/guides/go-sdk.html b/docs/pages/guides/go-sdk.html
new file mode 100644
index 000000000..f9db9b769
--- /dev/null
+++ b/docs/pages/guides/go-sdk.html
@@ -0,0 +1,344 @@
+Go SDK
+Use cagent as a Go library to embed AI agents in your applications.
+
+Overview
+
+cagent can be used as a Go library, allowing you to build AI agents directly into your Go applications. This gives you full programmatic control over agent creation, tool integration, and execution.
+
+
+ ℹ️ Import Path
+ import "github.com/docker/cagent/pkg/..."
+
+
+Core Packages
+
+
+ Package Purpose
+
+ pkg/agentAgent creation and configuration
+ pkg/runtimeAgent execution and event streaming
+ pkg/sessionConversation state management
+ pkg/teamMulti-agent team composition
+ pkg/toolsTool interface and utilities
+ pkg/tools/builtinBuilt-in tools (shell, filesystem, etc.)
+ pkg/model/provider/*Model provider clients
+ pkg/config/latestConfiguration types
+ pkg/environmentEnvironment and secrets
+
+
+
+Basic Example
+
+Create a simple agent and run it:
+
+package main
+
+import (
+ "context"
+ "fmt"
+ "log"
+ "os/signal"
+ "syscall"
+
+ "github.com/docker/cagent/pkg/agent"
+ "github.com/docker/cagent/pkg/config/latest"
+ "github.com/docker/cagent/pkg/environment"
+ "github.com/docker/cagent/pkg/model/provider/openai"
+ "github.com/docker/cagent/pkg/runtime"
+ "github.com/docker/cagent/pkg/session"
+ "github.com/docker/cagent/pkg/team"
+)
+
+func main() {
+ ctx, cancel := signal.NotifyContext(context.Background(),
+ syscall.SIGINT, syscall.SIGTERM)
+ defer cancel()
+
+ if err := run(ctx); err != nil {
+ log.Fatal(err)
+ }
+}
+
+func run(ctx context.Context) error {
+ // Create model provider
+ llm, err := openai.NewClient(
+ ctx,
+ &latest.ModelConfig{
+ Provider: "openai",
+ Model: "gpt-4o",
+ },
+ environment.NewDefaultProvider(),
+ )
+ if err != nil {
+ return err
+ }
+
+ // Create agent
+ assistant := agent.New(
+ "root",
+ "You are a helpful assistant.",
+ agent.WithModel(llm),
+ agent.WithDescription("A helpful assistant"),
+ )
+
+ // Create team and runtime
+ t := team.New(team.WithAgents(assistant))
+ rt, err := runtime.New(t)
+ if err != nil {
+ return err
+ }
+
+ // Run with a user message
+ sess := session.New(
+ session.WithUserMessage("What is 2 + 2?"),
+ )
+
+ messages, err := rt.Run(ctx, sess)
+ if err != nil {
+ return err
+ }
+
+ // Print the response
+ fmt.Println(messages[len(messages)-1].Message.Content)
+ return nil
+}
+
+Custom Tools
+
+Define custom tools for your agent:
+
+package main
+
+import (
+ "context"
+ "encoding/json"
+ "fmt"
+
+	"github.com/docker/cagent/pkg/agent"
+	"github.com/docker/cagent/pkg/tools"
+)
+
+// Define the tool's input schema
+type AddNumbersArgs struct {
+ A int `json:"a"`
+ B int `json:"b"`
+}
+
+// Implement the tool handler
+func addNumbers(_ context.Context, toolCall tools.ToolCall) (*tools.ToolCallResult, error) {
+ var args AddNumbersArgs
+ if err := json.Unmarshal([]byte(toolCall.Function.Arguments), &args); err != nil {
+ return nil, err
+ }
+
+ result := args.A + args.B
+ return tools.ResultSuccess(fmt.Sprintf("%d", result)), nil
+}
+
+func main() {
+ // Create the tool definition
+ addTool := tools.Tool{
+ Name: "add",
+ Category: "math",
+ Description: "Add two numbers together",
+ Parameters: tools.MustSchemaFor[AddNumbersArgs](),
+ Handler: addNumbers,
+ }
+
+	// Use with an agent (llm is a model provider, created as in the Basic Example)
+ calculator := agent.New(
+ "root",
+ "You are a calculator. Use the add tool for arithmetic.",
+ agent.WithModel(llm),
+ agent.WithTools(addTool),
+ )
+ // ...
+}
+
+Streaming Responses
+
+Process events as they happen:
+
+func runStreaming(ctx context.Context, rt *runtime.Runtime, sess *session.Session) error {
+ events := rt.RunStream(ctx, sess)
+
+ for event := range events {
+ switch e := event.(type) {
+ case *runtime.StreamStartedEvent:
+ fmt.Println("Stream started")
+
+ case *runtime.AgentChoiceEvent:
+ // Print response chunks as they arrive
+ fmt.Print(e.Content)
+
+ case *runtime.ToolCallEvent:
+ fmt.Printf("\n[Tool call: %s]\n", e.ToolCall.Function.Name)
+
+ case *runtime.ToolCallConfirmationEvent:
+ // Auto-approve tool calls
+ rt.Resume(ctx, runtime.ResumeRequest{
+ Type: runtime.ResumeTypeApproveSession,
+ })
+
+ case *runtime.ToolCallResponseEvent:
+ fmt.Printf("[Tool response: %s]\n", e.Response)
+
+ case *runtime.StreamStoppedEvent:
+ fmt.Println("\nStream stopped")
+
+ case *runtime.ErrorEvent:
+ return fmt.Errorf("error: %s", e.Error)
+ }
+ }
+
+ return nil
+}
+
+Multi-Agent Teams
+
+Create agents that delegate to sub-agents:
+
+package main
+
+import (
+ "github.com/docker/cagent/pkg/agent"
+	"github.com/docker/cagent/pkg/model/provider"
+	"github.com/docker/cagent/pkg/team"
+	"github.com/docker/cagent/pkg/tools/builtin"
+)
+
+func createTeam(llm provider.Provider) *team.Team {
+ // Create a child agent
+ researcher := agent.New(
+ "researcher",
+ "You research topics thoroughly.",
+ agent.WithModel(llm),
+ agent.WithDescription("Research specialist"),
+ )
+
+ // Create root agent with sub-agents
+ coordinator := agent.New(
+ "root",
+ "You coordinate research tasks.",
+ agent.WithModel(llm),
+ agent.WithDescription("Team coordinator"),
+ agent.WithSubAgents(researcher),
+ agent.WithToolSets(builtin.NewTransferTaskTool()),
+ )
+
+ return team.New(team.WithAgents(coordinator, researcher))
+}
+
+Built-in Tools
+
+Use cagent's built-in tools:
+
+import (
+	"os"
+
+	"github.com/docker/cagent/pkg/agent"
+	"github.com/docker/cagent/pkg/config"
+	"github.com/docker/cagent/pkg/model/provider"
+	"github.com/docker/cagent/pkg/tools/builtin"
+)
+
+func createAgentWithBuiltinTools(llm provider.Provider) *agent.Agent {
+ // Runtime config for tools that need it
+ rtConfig := &config.RuntimeConfig{
+ Config: config.Config{
+ WorkingDir: "/path/to/workdir",
+ },
+ }
+
+ return agent.New(
+ "root",
+ "You are a developer assistant.",
+ agent.WithModel(llm),
+ agent.WithToolSets(
+ // Shell tool for running commands
+ builtin.NewShellTool(os.Environ(), rtConfig, nil),
+ // Filesystem tools
+ builtin.NewFilesystemTool(rtConfig.Config.WorkingDir, nil),
+ // Think tool for reasoning
+ builtin.NewThinkTool(),
+ // Todo tool for task tracking
+ builtin.NewTodoTool(false), // false = not shared
+ ),
+ )
+}
+
+Using Different Providers
+
+import (
+ "github.com/docker/cagent/pkg/model/provider/anthropic"
+ "github.com/docker/cagent/pkg/model/provider/gemini"
+ "github.com/docker/cagent/pkg/model/provider/openai"
+)
+
+// OpenAI
+openaiClient, _ := openai.NewClient(ctx, &latest.ModelConfig{
+ Provider: "openai",
+ Model: "gpt-4o",
+}, env)
+
+// Anthropic
+anthropicClient, _ := anthropic.NewClient(ctx, &latest.ModelConfig{
+ Provider: "anthropic",
+ Model: "claude-sonnet-4-0",
+ MaxTokens: 64000,
+}, env)
+
+// Google Gemini
+geminiClient, _ := gemini.NewClient(ctx, &latest.ModelConfig{
+ Provider: "google",
+ Model: "gemini-2.5-flash",
+}, env)
+
+Session Options
+
+import "github.com/docker/cagent/pkg/session"
+
+sess := session.New(
+ // Set a title for the session
+ session.WithTitle("Code Review Task"),
+
+ // Add user message
+ session.WithUserMessage("Review this code for bugs"),
+
+ // Limit iterations
+ session.WithMaxIterations(20),
+
+ // Include a file attachment
+ session.WithUserMessage("main.go", codeContent),
+)
+
+Error Handling
+
+messages, err := rt.Run(ctx, sess)
+if err != nil {
+ if errors.Is(err, context.Canceled) {
+ // User cancelled
+ log.Println("Operation cancelled")
+ return nil
+ }
+ if errors.Is(err, context.DeadlineExceeded) {
+ // Timeout
+ log.Println("Operation timed out")
+ return nil
+ }
+ // Other error
+ return fmt.Errorf("runtime error: %w", err)
+}
+
+// Check for errors in the event stream
+for event := range rt.RunStream(ctx, sess) {
+ if errEvent, ok := event.(*runtime.ErrorEvent); ok {
+ return fmt.Errorf("stream error: %s", errEvent.Error)
+ }
+}
+
+Complete Example
+
+See the examples/golibrary directory for complete working examples:
+
+
+ simple/ — Basic agent with no tools
+ tool/ — Custom tool implementation
+ stream/ — Streaming event handling
+ multi/ — Multi-agent with sub-agents
+ builtintool/ — Using built-in tools
+
diff --git a/docs/pages/guides/tips.html b/docs/pages/guides/tips.html
new file mode 100644
index 000000000..7c0823575
--- /dev/null
+++ b/docs/pages/guides/tips.html
@@ -0,0 +1,349 @@
+Tips & Best Practices
+Expert guidance for building effective, efficient, and secure agents.
+
+Configuration Tips
+
+Auto Mode for Quick Start
+
+Don't have a config file? cagent can automatically detect your available API keys and use an appropriate model:
+
+# Automatically uses the best available provider
+cagent run
+
+# Provider priority: OpenAI → Anthropic → Google → Mistral → DMR
+
+The special auto model value also works in configs:
+
+agents:
+ root:
+ model: auto # Uses best available provider
+ description: Adaptive assistant
+ instruction: You are a helpful assistant.
+
+Environment Variable Interpolation
+
+Commands support JavaScript template literal syntax for environment variables:
+
+agents:
+ root:
+ model: openai/gpt-4o
+ description: Deployment assistant
+ instruction: You help with deployments.
+ commands:
+ # Simple variable
+ greet: "Hello ${env.USER}!"
+
+ # With default value
+ deploy: "Deploy to ${env.ENV || 'staging'}"
+
+ # Multiple variables
+ release: "Release ${env.PROJECT} v${env.VERSION || '1.0.0'}"
+
+Model Aliases Are Auto-Pinned
+
+cagent automatically resolves model aliases to their latest pinned versions. This ensures reproducible behavior:
+
+# You write:
+model: anthropic/claude-sonnet-4-5
+
+# cagent resolves to:
+# anthropic/claude-sonnet-4-5-20250929 (or latest available)
+
+To use a specific version, specify it explicitly in your config.
+
+Performance Tips
+
+Defer Tools for Faster Startup
+
+Large MCP toolsets can slow down agent startup. Use defer to load tools on-demand:
+
+agents:
+ root:
+ model: openai/gpt-4o
+ description: Multi-tool assistant
+ instruction: You have many tools available.
+ toolsets:
+ - type: mcp
+ ref: docker:github-official
+ - type: mcp
+ ref: docker:slack
+ - type: mcp
+ ref: docker:linear
+ # Load all tools on first use
+ defer: true
+
+Or defer specific tools:
+
+defer:
+ - "mcp:github:*" # Defer GitHub tools
+ - "mcp:slack:*" # Defer Slack tools
+
+Filter MCP Tools
+
+Many MCP servers expose dozens of tools. Filter to only what you need:
+
+toolsets:
+ - type: mcp
+ ref: docker:github-official
+ # Only expose these specific tools
+ tools:
+ - list_issues
+ - create_issue
+ - get_pull_request
+ - create_pull_request
+
+Fewer tools means faster tool selection and less confusion for the model.
+
+Set max_iterations
+
+Always set max_iterations for agents with powerful tools to prevent infinite loops:
+
+agents:
+ developer:
+ model: anthropic/claude-sonnet-4-0
+ description: Development assistant
+ instruction: You are a developer.
+ max_iterations: 30 # Reasonable limit for development tasks
+ toolsets:
+ - type: filesystem
+ - type: shell
+
+Typical values: 20-30 for development agents, 10-15 for simple tasks.
+
+Reliability Tips
+
+Use Fallback Models
+
+Configure fallback models for resilience against provider outages or rate limits:
+
+agents:
+ root:
+ model: anthropic/claude-sonnet-4-0
+ description: Reliable assistant
+ instruction: You are a helpful assistant.
+ fallback:
+ models:
+ # Different provider for resilience
+ - openai/gpt-4o
+ # Cheaper model as last resort
+ - openai/gpt-4o-mini
+ retries: 2 # Retry 5xx errors twice
+ cooldown: 1m # Stick with fallback for 1 min after rate limit
+
+Best practices for fallback chains:
+
+ - Use different providers for true redundancy
+ - Order by preference (best first)
+ - Include a cheaper/faster model as last resort
+
+
+Use Think Tool for Complex Tasks
+
+The think tool dramatically improves reasoning quality with minimal overhead:
+
+toolsets:
+ - type: think # Always include for complex agents
+
+The agent uses it as a scratchpad for planning and decision-making.
+
+Security Tips
+
+Use --yolo Mode Carefully
+
+The --yolo flag auto-approves all tool calls without confirmation:
+
+# Auto-approve everything (use with caution!)
+cagent run agent.yaml --yolo
+
+When it's appropriate:
+
+ - CI/CD pipelines with controlled inputs
+ - Automated testing
+ - Agents with only safe, read-only tools
+
+
+When to avoid:
+
+ - Interactive sessions with untested prompts
+ - Agents with shell or filesystem write access
+ - Any situation where unreviewed actions could cause harm
+
+
+Combine Permissions with Sandbox
+
+For defense in depth, use both permissions and sandbox mode:
+
+agents:
+ secure_dev:
+ model: anthropic/claude-sonnet-4-0
+ description: Secure development assistant
+ instruction: You are a secure coding assistant.
+ toolsets:
+ - type: filesystem
+ - type: shell
+ # Layer 1: Permission controls
+ permissions:
+ allow:
+ - "read_*"
+ - "shell:cmd=go*"
+ - "shell:cmd=npm*"
+ deny:
+ - "shell:cmd=sudo*"
+ - "shell:cmd=rm*-rf*"
+ # Layer 2: Container isolation
+ sandbox:
+ image: golang:1.23-alpine
+ paths:
+ - ".:rw"
+
+Use Hooks for Audit Logging
+
+Log all tool calls for compliance or debugging:
+
+agents:
+ audited:
+ model: openai/gpt-4o
+ description: Audited assistant
+ instruction: You are a helpful assistant.
+ hooks:
+ post_tool_use:
+ - matcher: "*"
+ hooks:
+ - type: command
+ command: "./scripts/audit-log.sh"
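+
+The config above assumes an audit script exists at ./scripts/audit-log.sh. A minimal sketch of what such a script might look like (hypothetical: the exact payload cagent passes to command hooks may vary by version, so this assumes event JSON arrives on stdin):

```shell
# audit-log.sh -- hypothetical audit logger sketch.
# Assumes the hook event arrives as JSON on stdin; AUDIT_LOG is our own convention.
AUDIT_LOG="${AUDIT_LOG:-./audit.log}"

audit_log() {
  # Prefix each event with a UTC timestamp and append it to the log
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(cat)" >> "$AUDIT_LOG"
}

# Example: record a fake post_tool_use event
echo '{"tool_name":"shell","status":"ok"}' | audit_log
```

+Rotate or truncate the log periodically if the agent makes many tool calls.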
+
+Multi-Agent Tips
+
+Handoffs vs Sub-Agents
+
+Understand the difference between sub_agents and handoffs:
+
+
+sub_agents (transfer_task)
+  Delegates a task to a child agent, waits for the result, then continues. The parent remains in control.
+  sub_agents: [researcher, writer]
+
+handoffs (A2A)
+  Transfers control entirely to another agent (possibly remote). One-way handoff.
+  handoffs:
+    - name: specialist
+      url: http://...
+
+
+Give Sub-Agents Clear Descriptions
+
+The root agent uses descriptions to decide which sub-agent to delegate to:
+
+agents:
+ root:
+ model: anthropic/claude-sonnet-4-0
+ description: Technical lead
+ instruction: Delegate to specialists based on the task.
+ sub_agents: [frontend, backend, devops]
+
+ frontend:
+ model: openai/gpt-4o
+ # Good: specific and actionable
+ description: |
+ Frontend specialist. Handles React, TypeScript, CSS,
+ UI components, and browser-related issues.
+
+ backend:
+ model: openai/gpt-4o
+ # Good: clear domain boundaries
+ description: |
+ Backend specialist. Handles APIs, databases,
+ server logic, and Go/Python code.
+
+ devops:
+ model: openai/gpt-4o
+ description: |
+ DevOps specialist. Handles CI/CD, Docker, Kubernetes,
+ infrastructure, and deployment pipelines.
+
+Debugging Tips
+
+Enable Debug Logging
+
+Use the --debug flag to see detailed execution logs:
+
+# Default log location: ~/.cagent/cagent.debug.log
+cagent run agent.yaml --debug
+
+# Custom log location
+cagent run agent.yaml --debug --log-file ./debug.log
+
+Check Token Usage
+
+Use the /usage command during a session to see token consumption:
+
+/usage
+
+Token Usage:
+ Input: 12,456 tokens
+ Output: 3,789 tokens
+ Total: 16,245 tokens
+
+Compact Long Sessions
+
+If a session gets too long, use /compact to summarize and reduce context:
+
+/compact
+
+Session compacted. Summary generated and history trimmed.
+
+More Tips
+
+User-Defined Default Model
+
+Set your preferred default model in ~/.config/cagent/config.yaml:
+
+settings:
+ default_model: anthropic/claude-sonnet-4-0
+
+This model is used when you run cagent run without a config file.
+
+GitHub PR Reviewer Example
+
+Use cagent as a GitHub Actions PR reviewer:
+
+# .github/workflows/pr-review.yml
+name: PR Review
+on:
+ pull_request:
+ types: [opened, synchronize]
+
+jobs:
+ review:
+ runs-on: ubuntu-latest
+ steps:
+ - uses: actions/checkout@v4
+ - name: Run cagent review
+ env:
+ ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+ GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+ run: |
+ # Install cagent
+ curl -fsSL https://get.cagent.dev | sh
+
+ # Run the review
+ cagent exec reviewer.yaml --yolo \
+ "Review PR #${{ github.event.pull_request.number }}"
+
+With a simple reviewer agent:
+
+# reviewer.yaml
+agents:
+ root:
+ model: anthropic/claude-sonnet-4-0
+ description: PR reviewer
+ instruction: |
+ Review pull requests for code quality, bugs, and security issues.
+ Be constructive and specific in your feedback.
+ toolsets:
+ - type: mcp
+ ref: docker:github-official
+ - type: think
diff --git a/docs/pages/providers/local.html b/docs/pages/providers/local.html
new file mode 100644
index 000000000..8bf6b39b7
--- /dev/null
+++ b/docs/pages/providers/local.html
@@ -0,0 +1,194 @@
+Local Models (Ollama, vLLM, LocalAI)
+Run cagent with locally hosted models for privacy, offline use, or cost savings.
+
+Overview
+
+cagent can connect to any OpenAI-compatible local model server. This guide covers the most popular options:
+
+
+ - Ollama — Easy-to-use local model runner
+ - vLLM — High-performance inference server
+ - LocalAI — OpenAI-compatible API for various backends
+
+
+
+ 💡 Docker Model Runner
+ For the easiest local model experience, consider Docker Model Runner which is built into Docker Desktop and requires no additional setup.
+
+
+Ollama
+
+Ollama is a popular tool for running LLMs locally. cagent includes a built-in ollama alias for easy configuration.
+
+Setup
+
+
+ - Install Ollama from ollama.ai
+ - Pull a model:
+
+  ollama pull llama3.2
+  ollama pull qwen2.5-coder
+
+ - Start the Ollama server (usually runs automatically):
+
+  ollama serve
+
+
+
+Configuration
+
+Use the built-in ollama alias:
+
+agents:
+ root:
+ model: ollama/llama3.2
+ description: Local assistant
+ instruction: You are a helpful assistant.
+
+The ollama alias automatically uses:
+
+ - Base URL: http://localhost:11434/v1
+ - API Type: OpenAI-compatible
+ - No API key required
+
+
+Custom Port or Host
+
+If Ollama runs on a different host or port:
+
+models:
+ my_ollama:
+ provider: ollama
+ model: llama3.2
+ base_url: http://192.168.1.100:11434/v1
+
+agents:
+ root:
+ model: my_ollama
+ description: Remote Ollama assistant
+ instruction: You are a helpful assistant.
+
+Popular Ollama Models
+
+
+  Model            Size   Best For
+
+  llama3.2         3B     General purpose, fast
+  llama3.1         8B     Better reasoning
+  qwen2.5-coder    7B     Code generation
+  mistral          7B     General purpose
+  codellama        7B     Code tasks
+  deepseek-coder   6.7B   Code generation
+
+
+
+vLLM
+
+vLLM is a high-performance inference server optimized for throughput.
+
+Setup
+
+# Install vLLM
+pip install vllm
+
+# Start the server
+python -m vllm.entrypoints.openai.api_server \
+ --model meta-llama/Llama-3.2-3B-Instruct \
+ --port 8000
+
+Configuration
+
+providers:
+ vllm:
+ api_type: openai_chatcompletions
+ base_url: http://localhost:8000/v1
+
+agents:
+ root:
+ model: vllm/meta-llama/Llama-3.2-3B-Instruct
+ description: vLLM-powered assistant
+ instruction: You are a helpful assistant.
+
+LocalAI
+
+LocalAI provides an OpenAI-compatible API that works with various backends.
+
+Setup
+
+# Run with Docker
+docker run -p 8080:8080 --name local-ai \
+ -v ./models:/models \
+ localai/localai:latest-cpu
+
+Configuration
+
+providers:
+ localai:
+ api_type: openai_chatcompletions
+ base_url: http://localhost:8080/v1
+
+agents:
+ root:
+ model: localai/gpt4all-j
+ description: LocalAI assistant
+ instruction: You are a helpful assistant.
+
+Generic Custom Provider
+
+For any OpenAI-compatible server:
+
+providers:
+ my_server:
+ api_type: openai_chatcompletions
+ base_url: http://localhost:8000/v1
+ # token_key: MY_API_KEY # if auth required
+
+agents:
+ root:
+ model: my_server/model-name
+ description: Custom server assistant
+ instruction: You are a helpful assistant.
+
+Performance Tips
+
+
+ ℹ️ Local Model Considerations
+
+ - Memory: Larger models need more RAM/VRAM. A 7B model typically needs 8-16GB RAM.
+ - GPU: GPU acceleration dramatically improves speed. Check your server's GPU support.
+ - Context length: Local models often have smaller context windows than cloud models.
+ - Tool calling: Not all local models support function/tool calling. Test your model's capabilities.
+
+
+
+Example: Offline Development Agent
+
+agents:
+ developer:
+ model: ollama/qwen2.5-coder
+ description: Offline code assistant
+ instruction: |
+ You are a software developer working offline.
+ Focus on code quality and clear explanations.
+ max_iterations: 20
+ toolsets:
+ - type: filesystem
+ - type: shell
+ - type: think
+ - type: todo
+
+Troubleshooting
+
+Connection Refused
+Ensure your model server is running and accessible:
+curl http://localhost:11434/v1/models # Ollama
+curl http://localhost:8000/v1/models # vLLM
+
+Model Not Found
+Verify the model is downloaded/available:
+ollama list # List available Ollama models
+
+Slow Responses
+
+ - Check if GPU acceleration is enabled
+ - Try a smaller model
+ - Reduce max_tokens in your config
+
diff --git a/docs/pages/providers/mistral.html b/docs/pages/providers/mistral.html
new file mode 100644
index 000000000..d4855e936
--- /dev/null
+++ b/docs/pages/providers/mistral.html
@@ -0,0 +1,110 @@
+Mistral
+Use Mistral AI models with cagent.
+
+Overview
+
+Mistral AI provides powerful language models through an OpenAI-compatible API. cagent includes built-in support for Mistral as an alias provider.
+
+Setup
+
+
+ - Get an API key from Mistral Console
+ - Set the environment variable:
+
+  export MISTRAL_API_KEY=your-api-key
+
+
+
+Usage
+
+Inline Syntax
+
+The simplest way to use Mistral:
+
+agents:
+ root:
+ model: mistral/mistral-large-latest
+ description: Assistant using Mistral
+ instruction: You are a helpful assistant.
+
+Named Model
+
+For more control over parameters:
+
+models:
+ mistral:
+ provider: mistral
+ model: mistral-large-latest
+ temperature: 0.7
+ max_tokens: 8192
+
+agents:
+ root:
+ model: mistral
+ description: Assistant using Mistral
+ instruction: You are a helpful assistant.
+
+Available Models
+
+
+  Model                   Description                         Context
+
+  mistral-large-latest    Most capable Mistral model          128K
+  mistral-medium-latest   Balanced performance and cost       128K
+  mistral-small-latest    Fast and cost-effective (default)   128K
+  codestral-latest        Optimized for code generation       32K
+  open-mistral-nemo       Open-weight model                   128K
+  ministral-8b-latest     Compact 8B parameter model          128K
+  ministral-3b-latest     Smallest Mistral model              128K
+
+
+
+Check the Mistral Models documentation for the latest available models.
+
+Auto-Detection
+
+When you run cagent run without specifying a config, cagent automatically detects available providers. If MISTRAL_API_KEY is set and higher-priority providers (OpenAI, Anthropic, Google) are not available, Mistral will be used with mistral-small-latest as the default model.
+
+Extended Thinking
+
+Mistral models support thinking mode through the OpenAI-compatible API. By default, cagent enables medium thinking effort:
+
+models:
+ mistral:
+ provider: mistral
+ model: mistral-large-latest
+ thinking_budget: high # minimal, low, medium, high, or none
+
+To disable thinking:
+
+models:
+ mistral:
+ provider: mistral
+ model: mistral-large-latest
+ thinking_budget: none
+
+How It Works
+
+Mistral is implemented as a built-in alias in cagent:
+
+
+ - API Type: OpenAI-compatible (openai_chatcompletions)
+ - Base URL: https://api.mistral.ai/v1
+ - Token Variable: MISTRAL_API_KEY
+
+
+This means Mistral uses the same client as OpenAI, making it fully compatible with all OpenAI features supported by cagent.
+
+Example: Code Assistant
+
+agents:
+ coder:
+ model: mistral/codestral-latest
+ description: Expert code assistant
+ instruction: |
+ You are an expert programmer using Codestral.
+ Write clean, efficient, well-documented code.
+ Explain your reasoning when helpful.
+ toolsets:
+ - type: filesystem
+ - type: shell
+ - type: think
diff --git a/docs/pages/providers/nebius.html b/docs/pages/providers/nebius.html
new file mode 100644
index 000000000..c5058ff73
--- /dev/null
+++ b/docs/pages/providers/nebius.html
@@ -0,0 +1,82 @@
+Nebius
+Use Nebius AI models with cagent.
+
+Overview
+
+Nebius provides AI models through an OpenAI-compatible API. cagent includes built-in support for Nebius as an alias provider.
+
+Setup
+
+
+ - Get an API key from Nebius AI
+ - Set the environment variable:
+
+  export NEBIUS_API_KEY=your-api-key
+
+
+
+Usage
+
+Inline Syntax
+
+The simplest way to use Nebius:
+
+agents:
+ root:
+ model: nebius/deepseek-ai/DeepSeek-V3
+ description: Assistant using Nebius
+ instruction: You are a helpful assistant.
+
+Named Model
+
+For more control over parameters:
+
+models:
+ nebius_model:
+ provider: nebius
+ model: deepseek-ai/DeepSeek-V3
+ temperature: 0.7
+ max_tokens: 8192
+
+agents:
+ root:
+ model: nebius_model
+ description: Assistant using Nebius
+ instruction: You are a helpful assistant.
+
+Available Models
+
+Nebius hosts various open models. Check the Nebius documentation for the current model catalog.
+
+
+  Model                               Description
+
+  deepseek-ai/DeepSeek-V3             DeepSeek V3 model
+  Qwen/Qwen2.5-72B-Instruct           Qwen 2.5 72B instruction-tuned
+  meta-llama/Llama-3.3-70B-Instruct   Llama 3.3 70B instruction-tuned
+
+
+
+How It Works
+
+Nebius is implemented as a built-in alias in cagent:
+
+
+ - API Type: OpenAI-compatible (openai_chatcompletions)
+ - Base URL: https://api.studio.nebius.ai/v1
+ - Token Variable: NEBIUS_API_KEY
+
+
+Example: Code Assistant
+
+agents:
+ coder:
+ model: nebius/deepseek-ai/DeepSeek-V3
+ description: Code assistant using DeepSeek
+ instruction: |
+ You are an expert programmer using DeepSeek V3.
+ Write clean, well-documented code.
+ Follow best practices for the language being used.
+ toolsets:
+ - type: filesystem
+ - type: shell
+ - type: think
diff --git a/docs/pages/providers/xai.html b/docs/pages/providers/xai.html
new file mode 100644
index 000000000..1260936fb
--- /dev/null
+++ b/docs/pages/providers/xai.html
@@ -0,0 +1,95 @@
+xAI (Grok)
+Use xAI's Grok models with cagent.
+
+Overview
+
+xAI provides the Grok family of models through an OpenAI-compatible API. cagent includes built-in support for xAI as an alias provider.
+
+Setup
+
+
+ - Get an API key from xAI Console
+ - Set the environment variable:
+
+  export XAI_API_KEY=your-api-key
+
+
+
+Usage
+
+Inline Syntax
+
+The simplest way to use xAI:
+
+agents:
+ root:
+ model: xai/grok-3
+ description: Assistant using Grok
+ instruction: You are a helpful assistant.
+
+Named Model
+
+For more control over parameters:
+
+models:
+ grok:
+ provider: xai
+ model: grok-3
+ temperature: 0.7
+ max_tokens: 8192
+
+agents:
+ root:
+ model: grok
+ description: Assistant using Grok
+ instruction: You are a helpful assistant.
+
+Available Models
+
+
+  Model              Description                          Context
+
+  grok-3             Latest and most capable Grok model   131K
+  grok-3-fast        Faster variant with lower latency    131K
+  grok-3-mini        Compact model for simpler tasks      131K
+  grok-3-mini-fast   Fast variant of the mini model       131K
+  grok-2             Previous generation model            128K
+  grok-vision        Vision-capable model                 32K
+
+
+
+Check the xAI documentation for the latest available models.
+
+Extended Thinking
+
+Grok models support thinking mode through the OpenAI-compatible API:
+
+models:
+ grok:
+ provider: xai
+ model: grok-3
+ thinking_budget: high # minimal, low, medium, high, or none
+
+How It Works
+
+xAI is implemented as a built-in alias in cagent:
+
+
+ - API Type: OpenAI-compatible (openai_chatcompletions)
+ - Base URL: https://api.x.ai/v1
+ - Token Variable: XAI_API_KEY
+
+
+Example: Research Assistant
+
+agents:
+ researcher:
+ model: xai/grok-3
+ description: Research assistant with real-time knowledge
+ instruction: |
+ You are a research assistant using Grok.
+ Provide well-researched, factual responses.
+ Cite sources when available.
+ toolsets:
+ - type: mcp
+ ref: docker:duckduckgo
+ - type: think
diff --git a/docs/pages/tools/api.html b/docs/pages/tools/api.html
new file mode 100644
index 000000000..56dd4bd7a
--- /dev/null
+++ b/docs/pages/tools/api.html
@@ -0,0 +1,201 @@
+API Tool
+Create custom tools that call HTTP APIs.
+
+Overview
+
+The API tool type lets you define custom tools that make HTTP requests to external APIs. This is useful for integrating agents with REST APIs, webhooks, or any HTTP-based service without writing code.
+
+
+ ℹ️ When to Use
+
+ - Integrating with REST APIs that don't have an MCP server
+ - Simple HTTP operations (GET, POST)
+ - Quick prototyping before building a full MCP server
+
+
+
+Configuration
+
+agents:
+ assistant:
+ model: openai/gpt-4o
+ description: Assistant with API access
+ instruction: You can look up weather information.
+ toolsets:
+ - type: api
+ name: get_weather
+ method: GET
+ endpoint: "https://api.weather.example/v1/current?city=${city}"
+ instruction: Get current weather for a city
+ args:
+ city:
+ type: string
+ description: City name to get weather for
+ required: ["city"]
+ headers:
+ Authorization: "Bearer ${env.WEATHER_API_KEY}"
+
+Properties
+
+
+  Property        Type     Required   Description
+
+  name            string   ✓          Tool name (how the agent references it)
+  method          string   ✓          HTTP method: GET or POST
+  endpoint        string   ✓          URL endpoint (supports ${param} interpolation)
+  instruction     string   ✗          Description shown to the agent
+  args            object   ✗          Parameter definitions (JSON Schema properties)
+  required        array    ✗          List of required parameter names
+  headers         object   ✗          HTTP headers to include
+  output_schema   object   ✗          JSON Schema for the response (for documentation)
+
+
+
+HTTP Methods
+
+GET Requests
+
+For GET requests, parameters are interpolated into the URL:
+
+toolsets:
+ - type: api
+ name: search_users
+ method: GET
+ endpoint: "https://api.example.com/users?q=${query}&limit=${limit}"
+ instruction: Search for users by name
+ args:
+ query:
+ type: string
+ description: Search query
+ limit:
+ type: integer
+ description: Maximum results (default 10)
+ required: ["query"]
+
+POST Requests
+
+For POST requests, parameters are sent as JSON in the request body:
+
+toolsets:
+ - type: api
+ name: create_task
+ method: POST
+ endpoint: "https://api.example.com/tasks"
+ instruction: Create a new task
+ args:
+ title:
+ type: string
+ description: Task title
+ description:
+ type: string
+ description: Task description
+ priority:
+ type: string
+ enum: ["low", "medium", "high"]
+ description: Task priority
+ required: ["title"]
+ headers:
+ Content-Type: "application/json"
+ Authorization: "Bearer ${env.API_TOKEN}"
+
+URL Interpolation
+
+Use ${param} syntax to insert parameter values into URLs:
+
+endpoint: "https://api.example.com/users/${user_id}/posts/${post_id}"
+
+Parameter values are URL-encoded automatically.
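+
+The interpolation and encoding behavior can be sketched in a few lines of Python. This is an illustration of the documented behavior, not cagent's actual implementation:

```python
import re
from urllib.parse import quote

def interpolate(endpoint: str, params: dict) -> str:
    """Replace each ${param} placeholder with its URL-encoded value."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: quote(str(params[m.group(1)]), safe=""),
        endpoint,
    )

url = interpolate(
    "https://api.example.com/users?q=${query}&limit=${limit}",
    {"query": "alice smith", "limit": 10},
)
print(url)  # https://api.example.com/users?q=alice%20smith&limit=10
```

+Note that the space in "alice smith" is percent-encoded, so parameter values cannot break the URL structure.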
+
+Headers
+
+Headers can include environment variables:
+
+headers:
+ Authorization: "Bearer ${env.API_KEY}"
+ X-Custom-Header: "static-value"
+ Content-Type: "application/json"
+
+Output Schema
+
+Optionally document the expected response format:
+
+toolsets:
+ - type: api
+ name: get_user
+ method: GET
+ endpoint: "https://api.example.com/users/${id}"
+ instruction: Get user details by ID
+ args:
+ id:
+ type: string
+ description: User ID
+ required: ["id"]
+ output_schema:
+ type: object
+ properties:
+ id:
+ type: string
+ name:
+ type: string
+ email:
+ type: string
+ created_at:
+ type: string
+
+Example: GitHub API
+
+agents:
+ github_assistant:
+ model: openai/gpt-4o
+ description: Assistant that can query GitHub
+ instruction: You can look up GitHub repositories and users.
+ toolsets:
+ - type: api
+ name: get_repo
+ method: GET
+ endpoint: "https://api.github.com/repos/${owner}/${repo}"
+ instruction: Get information about a GitHub repository
+ args:
+ owner:
+ type: string
+ description: Repository owner (user or org)
+ repo:
+ type: string
+ description: Repository name
+ required: ["owner", "repo"]
+ headers:
+ Accept: "application/vnd.github.v3+json"
+ Authorization: "Bearer ${env.GITHUB_TOKEN}"
+
+ - type: api
+ name: get_user
+ method: GET
+ endpoint: "https://api.github.com/users/${username}"
+ instruction: Get information about a GitHub user
+ args:
+ username:
+ type: string
+ description: GitHub username
+ required: ["username"]
+ headers:
+ Accept: "application/vnd.github.v3+json"
+
+Limitations
+
+
+ - Only supports GET and POST methods
+ - Response body is limited to 1MB
+ - 30 second timeout per request
+ - Only HTTP and HTTPS URLs are supported
+ - No support for file uploads or multipart forms
+
+
+
+ 💡 For Complex APIs
+ For APIs that need authentication flows, pagination, or complex request/response handling, consider using an MCP server instead. The API tool is best for simple, stateless HTTP operations.
+
+
+
+ ⚠️ Security
+ API keys and tokens in headers are visible in debug logs. Use environment variables (${env.VAR}) rather than hardcoding secrets in configuration files.
+
diff --git a/docs/pages/tools/lsp.html b/docs/pages/tools/lsp.html
new file mode 100644
index 000000000..91be6a7ec
--- /dev/null
+++ b/docs/pages/tools/lsp.html
@@ -0,0 +1,183 @@
+LSP Tool
+Connect to Language Server Protocol servers for code intelligence.
+
+Overview
+
+The LSP tool connects your agent to any Language Server Protocol (LSP) server, providing comprehensive code intelligence capabilities like go-to-definition, find references, diagnostics, and more.
+
+
+ ℹ️ What is LSP?
+ The Language Server Protocol is a standard for providing language features like autocomplete, go-to-definition, and diagnostics. Most programming languages have LSP servers available.
+
+
+Configuration
+
+agents:
+ developer:
+ model: anthropic/claude-sonnet-4-0
+ description: Code developer with LSP support
+ instruction: You are a software developer.
+ toolsets:
+ - type: lsp
+ command: gopls
+ args: []
+ file_types: [".go"]
+ - type: filesystem
+ - type: shell
+
+Properties
+
+
+  Property     Type     Required   Description
+
+  command      string   ✓          LSP server executable command
+  args         array    ✗          Command-line arguments for the LSP server
+  env          object   ✗          Environment variables for the LSP process
+  file_types   array    ✗          File extensions this LSP handles (e.g., [".go", ".mod"])
+
+
+
+Available Tools
+
+The LSP toolset provides these tools to the agent:
+
+
+  Tool                    Description                                     Read-Only
+
+  lsp_workspace           Get workspace info and available capabilities   ✓
+  lsp_hover               Get type info and documentation for a symbol    ✓
+  lsp_definition          Find where a symbol is defined                  ✓
+  lsp_references          Find all references to a symbol                 ✓
+  lsp_document_symbols    List all symbols in a file                      ✓
+  lsp_workspace_symbols   Search symbols across the workspace             ✓
+  lsp_diagnostics         Get errors and warnings for a file              ✓
+  lsp_code_actions        Get available quick fixes and refactorings      ✓
+  lsp_rename              Rename a symbol across the workspace            ✗
+  lsp_format              Format a file                                   ✗
+  lsp_call_hierarchy      Find incoming/outgoing calls                    ✓
+  lsp_type_hierarchy      Find supertypes/subtypes                        ✓
+  lsp_implementations     Find interface implementations                  ✓
+  lsp_signature_help      Get function signature at call site             ✓
+  lsp_inlay_hints         Get type annotations and parameter names        ✓
+
+
+
+Common LSP Servers
+
+Here are configurations for popular languages:
+
+Go (gopls)
+
+toolsets:
+ - type: lsp
+ command: gopls
+ file_types: [".go"]
+
+TypeScript/JavaScript (typescript-language-server)
+
+toolsets:
+ - type: lsp
+ command: typescript-language-server
+ args: ["--stdio"]
+ file_types: [".ts", ".tsx", ".js", ".jsx"]
+
+Python (pylsp)
+
+toolsets:
+ - type: lsp
+ command: pylsp
+ file_types: [".py"]
+
+Rust (rust-analyzer)
+
+toolsets:
+ - type: lsp
+ command: rust-analyzer
+ file_types: [".rs"]
+
+C/C++ (clangd)
+
+toolsets:
+ - type: lsp
+ command: clangd
+ file_types: [".c", ".cpp", ".h", ".hpp"]
+
+Multiple LSP Servers
+
+You can configure multiple LSP servers for different file types:
+
+agents:
+ polyglot:
+ model: anthropic/claude-sonnet-4-0
+ description: Multi-language developer
+ instruction: You are a full-stack developer.
+ toolsets:
+ - type: lsp
+ command: gopls
+ file_types: [".go"]
+ - type: lsp
+ command: typescript-language-server
+ args: ["--stdio"]
+ file_types: [".ts", ".tsx", ".js", ".jsx"]
+ - type: lsp
+ command: pylsp
+ file_types: [".py"]
+ - type: filesystem
+ - type: shell
+
+Workflow Instructions
+
+The LSP tool includes built-in instructions that guide the agent on how to use it effectively. The agent learns to:
+
+
+ - Start with lsp_workspace to understand available capabilities
+ - Use lsp_workspace_symbols to find relevant code
+ - Use lsp_references before modifying any symbol
+ - Check lsp_diagnostics after every code change
+ - Apply lsp_format after edits are complete
+
+
+
+ 💡 Best Practice
+ Always include the filesystem tool alongside LSP. The agent needs filesystem access to read and write code files, while LSP provides intelligence about the code.
+
+
+Capability Detection
+
+Not all LSP servers support all features. The agent uses lsp_workspace to discover what's available:
+
+Workspace Information:
+- Root: /path/to/project
+- Server: gopls v0.14.0
+- File types: .go
+
+Available Capabilities:
+- Hover: Yes
+- Go to Definition: Yes
+- Find References: Yes
+- Rename: Yes
+- Code Actions: Yes
+- Formatting: Yes
+- Call Hierarchy: Yes
+- Type Hierarchy: Yes
+...
+
+Position Format
+
+All LSP tools use 1-based line and character positions:
+
+
+ - Line 1 is the first line of the file
+ - Character 1 is the first character on a line
+
+
+{
+ "file": "/path/to/file.go",
+ "line": 42,
+ "character": 15
+}
+
+
+ ⚠️ Server Installation
+ The LSP server must be installed and available in the system PATH. cagent does not install LSP servers automatically. Install them using your language's package manager (e.g., go install golang.org/x/tools/gopls@latest).
+
diff --git a/docs/pages/tools/user-prompt.html b/docs/pages/tools/user-prompt.html
new file mode 100644
index 000000000..80a4be87a
--- /dev/null
+++ b/docs/pages/tools/user-prompt.html
@@ -0,0 +1,173 @@
+User Prompt Tool
+Ask the user questions and collect interactive input during agent execution.
+
+Overview
+
+The user prompt tool allows agents to ask questions and collect input from users during execution. This enables interactive workflows where the agent needs clarification, confirmation, or additional information before proceeding.
+
+
+ ℹ️ When to Use
+
+ - When the agent needs clarification before proceeding
+ - Collecting credentials or configuration values
+ - Presenting choices and getting user decisions
+ - Confirming destructive or important actions
+
+
+
+Configuration
+
+agents:
+ assistant:
+ model: openai/gpt-4o
+ description: Interactive assistant
+ instruction: |
+ You are a helpful assistant. When you need information
+ from the user, use the user_prompt tool to ask them.
+ toolsets:
+ - type: user_prompt
+ - type: filesystem
+ - type: shell
+
+Tool Interface
+
+The user_prompt tool takes these parameters:
+
+
+  Parameter   Type     Required   Description
+
+  message     string   ✓          The question or prompt to display
+  schema      object   ✗          JSON Schema defining expected response structure
+
+
+
+Response Format
+
+The tool returns a JSON response:
+
+{
+ "action": "accept",
+ "content": {
+ "field1": "user value",
+ "field2": true
+ }
+}
+
+Action Values
+
+
+  Action    Meaning
+
+  accept    User provided a response (check content)
+  decline   User declined to answer
+  cancel    User cancelled the prompt
+
+
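+A client consuming these responses might branch on the action field like this (the handler name is hypothetical; only the response shape comes from the format above):

```python
import json

def handle_response(raw: str) -> str:
    """Dispatch on the action field of a user_prompt result."""
    resp = json.loads(raw)
    action = resp["action"]
    if action == "accept":
        # Only "accept" carries usable content
        return f"got: {resp['content']}"
    if action == "decline":
        return "user declined; try another approach"
    if action == "cancel":
        return "operation cancelled"
    raise ValueError(f"unknown action: {action}")

print(handle_response('{"action": "accept", "content": {"field1": "user value"}}'))
```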
+
+Schema Examples
+
+Simple String Input
+
+{
+ "type": "string",
+ "title": "API Key",
+ "description": "Enter your API key"
+}
+
+Multiple Choice
+
+{
+ "type": "string",
+ "enum": ["development", "staging", "production"],
+ "title": "Environment",
+ "description": "Select the target environment"
+}
+
+Boolean Confirmation
+
+{
+ "type": "boolean",
+ "title": "Confirm",
+ "description": "Are you sure you want to proceed?"
+}
+
+Object with Multiple Fields
+
+{
+ "type": "object",
+ "properties": {
+ "username": {
+ "type": "string",
+ "description": "Your username"
+ },
+ "password": {
+ "type": "string",
+ "description": "Your password"
+ },
+ "remember": {
+ "type": "boolean",
+ "description": "Remember credentials"
+ }
+ },
+ "required": ["username", "password"]
+}
+
+Number Input
+
+{
+ "type": "integer",
+ "title": "Port Number",
+ "description": "Enter the port number (1024-65535)",
+ "minimum": 1024,
+ "maximum": 65535
+}
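+
+These schemas follow standard JSON Schema semantics, so a client UI can validate input before returning it to the agent. A minimal stdlib-only check for the integer schema above might look like this sketch:

```python
schema = {"type": "integer", "minimum": 1024, "maximum": 65535}

def valid_port(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly
    return (
        isinstance(value, int)
        and not isinstance(value, bool)
        and schema["minimum"] <= value <= schema["maximum"]
    )

print(valid_port(8080))  # True
print(valid_port(80))    # False: below the allowed minimum
```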
+
+Example Usage
+
+Here's how an agent might use the user prompt tool:
+
+Agent: I need to deploy this application. Let me ask which environment to target.
+
+[Calls user_prompt with message: "Which environment should I deploy to?"
+ and schema with enum: ["development", "staging", "production"]]
+
+User selects: "staging"
+
+Agent: Great, I'll deploy to staging. Let me confirm this action.
+
+[Calls user_prompt with message: "Deploy to staging? This will replace the current version."
+ and schema with type: "boolean"]
+
+User confirms: true
+
+Agent: Deploying to staging...
+
+UI Presentation
+
+How the prompt appears depends on the interface:
+
+
+ - TUI: Displays an interactive dialog with appropriate input controls
+ - CLI (exec mode): Prints the prompt and reads from stdin
+ - API/MCP: Returns an elicitation request to the client
+
+
+
+ 💡 Best Practice
+ Provide clear, concise messages. Include context about why you're asking and what the information will be used for. Use schemas with descriptions to guide users on expected input format.
+
+
+Handling Responses
+
+The agent should handle all possible actions:
+
+
+ - accept: Process the content and continue
+ - decline: Acknowledge and try an alternative approach or explain what's needed
+ - cancel: Stop the current operation gracefully
+
+
+
+ ⚠️ Context Requirement
+ The user prompt tool requires an elicitation handler to be configured. It works in the TUI and CLI modes but may not be available in all contexts (e.g., some MCP client configurations).
+