Document callback URL and compose storage by santoshkumarradha · Pull Request #8 · Agent-Field/agentfield

santoshkumarradha · 2025-11-11T19:14:34Z

Summary

- Configure the docker control-plane service to export the PostgreSQL storage provider env vars
- Document how to set AGENT_CALLBACK_URL when agents run outside the compose network

Testing

not run (not requested)

Adds environment variables to the Docker Compose file so that the control plane uses PostgreSQL for storage. Updates the README with instructions on setting AGENT_CALLBACK_URL for agents running outside Docker, because the containerized control plane cannot reach such an agent through localhost.
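
For context, `localhost` inside a container resolves to the container itself, so an agent on the host has to advertise an address the control plane can actually reach. A minimal sketch of picking that address follows. The helper name, the port, and the `host.docker.internal` fallback are assumptions for illustration (the alias is provided by Docker Desktop and may need extra setup on Linux); only `AGENT_CALLBACK_URL` comes from this PR.

```python
def resolve_callback_url(env, port, host="host.docker.internal"):
    """Return the URL an agent outside the compose network should advertise.

    An explicit AGENT_CALLBACK_URL always wins. The fallback host is an
    assumption: host.docker.internal is the Docker Desktop alias for the
    host machine and is not guaranteed to exist on every platform.
    """
    explicit = env.get("AGENT_CALLBACK_URL", "").strip()
    if explicit:
        # Normalize so later path joins do not produce a double slash.
        return explicit.rstrip("/")
    return f"http://{host}:{port}"
```

Calling `resolve_callback_url(os.environ, port=8001)` before starting the agent and exporting the result as `AGENT_CALLBACK_URL` would be one way to use it; the port here is hypothetical.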

Surfaced by the first end-to-end docker test of a codex-built medical-triage backend. Fixes 5 real bugs that hid behind py_compile + docker compose config validation, plus pushes the architecture philosophy from "flat orchestrator fans out specialists" to "deep DAG of reasoners as software APIs".

## Bugs fixed

1. **Broken healthcheck: agentfield/control-plane:latest is distroless.** The image has no /bin/sh, no wget, no curl. The CMD-based healthcheck ["wget", "--quiet", ...] always failed, blocking every first build with "dependency failed to start: container is unhealthy". Drop the healthcheck entirely + switch depends_on to condition: service_started. The agent SDK already retries connection on startup.
   File: control-plane/internal/templates/docker/docker-compose.yml.tmpl

2. **Dead default model: openrouter/anthropic/claude-3.5-sonnet returns 404 from OpenRouter** (litellm.NotFoundError: No endpoints found for anthropic/claude-3.5-sonnet). Every previously generated example would crash on first real curl. Replace with openrouter/google/gemini-2.5-flash (verified working in the live test) across:
   - SKILL.md, all 6 reference files
   - control-plane/internal/cli/doctor.go (Recommendation block)
   - control-plane/internal/cli/init.go (--default-model default)
   - control-plane/internal/templates/templates.go (TemplateData doc comment)
   - control-plane/internal/templates/python/main.py.tmpl (env default)

3. **90s sync execute timeout undocumented.** The control plane has a hard 90-second timeout on POST /api/v1/execute/<target>. Slow models (minimax-m2.7, Claude Sonnet, o1) and large fan-outs blow it. Generated systems would hit HTTP 400 {"error":"execution timeout after 1m30s"} with no guidance. Document the limit + the async fallback path (POST /api/v1/execute/async) in verification.md, plus point at gemini-2.5-flash as the recommended fast default.

4. **Discovery API curl shape was wrong everywhere.** The skill teaches `.reasoners[] | select(.node_id=="X") | .name` but the actual response is `.capabilities[].reasoners[]` with `agent_id` (not `node_id`) and `id` (not `name`). Same for /api/v1/nodes: its default ?health_status=active filter hides healthy nodes that haven't reported "active" yet, so use ?health_status=any. Fix in SKILL.md and verification.md.

5. **Python init template violated the skill's own hard rules.** The scaffold from `af init` was using app.serve(auto_port=True) and hardcoding agentfield_server, which the skill explicitly rejects. Codex had to fully rewrite main.py on every build. Update the template to use app.run(auto_port=False), env-driven AGENT_NODE_ID/AGENTFIELD_SERVER/AI_MODEL/PORT, and a real AIConfig. The scaffold is now consistent with the skill's mandatory patterns out of the box.

## New philosophy: reasoners as software APIs

Codex's first build (and the loan-underwriter before it) produced a "fat orchestrator + flat specialists" star pattern: depth-2 DAG, single-layer parallelism, every specialist has a 50-line .ai() prompt, no reuse across branches. That's basically asyncio.gather([llm_call_1, llm_call_2, ...]) with extra ceremony.

The right shape is **deep composition cascade**: each reasoner has a single cognitive responsibility, the orchestrator pushes calls DOWN into sub-reasoners, parallelism happens at multiple depths, common sub-reasoners get reused across branches. Each reasoner has a one-line API contract you could write down. They are software APIs.

Added to the skill:

- New mandatory section "The unit of intelligence is the reasoner — treat them as software APIs" in SKILL.md, with bad/good shape ASCII diagrams, concrete decomposition rules (30-line ceiling, single-judgment rule, reuse-signal extraction), and depth ≥ 3 minimum
- New "Reasoner Composition Cascade" pattern (#8) in architecture-patterns.md marked as the master pattern that every other pattern layers onto
- Updated "How to pick a pattern" picker to start from cascade as the backbone instead of treating it as one option among many
- HARD GATE updated: "If you cannot draw your system as a non-trivial graph with depth ≥ 3, you have not architected anything"
- Grooming rule conflict resolved: the skip-the-question rule now lives inside the HARD GATE block so agents see them together, not as competing instructions in separate sections

## Tested end-to-end

Live test of the v1 medical-triage build:

- docker compose up --build → both containers up
- 9 reasoners discovered through /api/v1/discovery/capabilities
- Real curl with the Maria Hernandez patient case → CALL_911_NOW with full provenance, 17 second wall clock, HTTP 200, 16KB structured response
- The adversarial reviewer correctly steel-manned Pulmonary Embolism (because the chest pain is pleuritic) on top of the AMI primary concern
- Deterministic governance overrides fired correctly when committee confidence dipped; the safe-default fallback pattern works in production

The build only succeeded after the manual healthcheck patch + the model swap to gemini-2.5-flash. Both fixes are now baked into the templates so the next codex run will produce a working build on first try.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
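
The corrected discovery shape from bug 4 can be illustrated with a small parser. This is a sketch, not the control plane's documented schema: it assumes `agent_id` sits on each entry of `capabilities` and `id` on each reasoner, and the sample payload is invented for illustration.

```python
def reasoner_ids(discovery, agent_id):
    """Collect reasoner ids for one agent from a parsed discovery response.

    Walks the .capabilities[].reasoners[] path, filtering on agent_id
    (not node_id) and reading id (not name). Where agent_id lives in the
    payload is an assumption made for this sketch.
    """
    ids = []
    for capability in discovery.get("capabilities", []):
        if capability.get("agent_id") != agent_id:
            continue
        for reasoner in capability.get("reasoners", []):
            ids.append(reasoner["id"])
    return ids


# Invented sample payload, shaped per the assumption above.
sample = {
    "capabilities": [
        {"agent_id": "triage", "reasoners": [{"id": "intake"}, {"id": "route"}]},
        {"agent_id": "billing", "reasoners": [{"id": "invoice"}]},
    ]
}
print(reasoner_ids(sample, "triage"))  # ['intake', 'route']
```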
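
The env-driven configuration described in bug 5 might look like the following. It is a sketch only: the dataclass, the fallback node id, the fallback server URL, and the fallback port are hypothetical, and the real template wires these values into the SDK rather than into a standalone class. The four variable names and the model default come from this PR.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSettings:
    node_id: str
    server: str
    model: str
    port: int


def load_settings(env=None):
    """Read the four variables named in the PR, with illustrative fallbacks."""
    env = os.environ if env is None else env
    return AgentSettings(
        node_id=env.get("AGENT_NODE_ID", "my-agent"),  # hypothetical fallback
        server=env.get("AGENTFIELD_SERVER", "http://localhost:8080"),  # hypothetical fallback
        model=env.get("AI_MODEL", "openrouter/google/gemini-2.5-flash"),
        port=int(env.get("PORT", "8001")),  # hypothetical fallback
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise in tests without touching the process environment.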

santoshkumarradha merged commit d2cc225 into main Nov 11, 2025
1 check passed

santoshkumarradha deleted the santosh/docker-config branch November 11, 2025 19:15
