Document callback URL and compose storage#8

Merged
santoshkumarradha merged 1 commit into main from santosh/docker-config
Nov 11, 2025

Conversation

@santoshkumarradha
Member

Summary

  • configure the docker control-plane service to export the PostgreSQL storage provider env vars
  • document how to set AGENT_CALLBACK_URL when agents run outside the compose network

Testing

  • not run (not requested)

Adds environment variables to the Docker Compose file to configure the control plane to use PostgreSQL for storage.

Updates the README with instructions on setting AGENT_CALLBACK_URL for agents running outside Docker, as they cannot reach localhost.
@santoshkumarradha santoshkumarradha merged commit d2cc225 into main Nov 11, 2025
1 check passed
@santoshkumarradha santoshkumarradha deleted the santosh/docker-config branch November 11, 2025 19:15
santoshkumarradha added a commit that referenced this pull request Apr 8, 2026
Surfaced by the first end-to-end docker test of a codex-built medical-triage
backend. Fixes 5 real bugs that hid behind py_compile + docker compose config
validation, plus pushes the architecture philosophy from "flat orchestrator
fans out specialists" to "deep DAG of reasoners as software APIs".

## Bugs fixed

1. **Broken healthcheck — agentfield/control-plane:latest is distroless.**
   The image has no /bin/sh, no wget, no curl. The CMD-based healthcheck
   ["wget", "--quiet", ...] always failed, blocking every first build with
   "dependency failed to start: container is unhealthy". Drop the healthcheck
   entirely + switch depends_on to condition: service_started. The agent SDK
   already retries connection on startup.
   File: control-plane/internal/templates/docker/docker-compose.yml.tmpl

2. **Dead default model — openrouter/anthropic/claude-3.5-sonnet returns 404
   from OpenRouter** (litellm.NotFoundError: No endpoints found for
   anthropic/claude-3.5-sonnet). Every previously generated example would
   crash on first real curl. Replace with openrouter/google/gemini-2.5-flash
   (verified working in the live test) across:
   - SKILL.md, all 6 reference files
   - control-plane/internal/cli/doctor.go (Recommendation block)
   - control-plane/internal/cli/init.go (--default-model default)
   - control-plane/internal/templates/templates.go (TemplateData doc comment)
   - control-plane/internal/templates/python/main.py.tmpl (env default)

3. **90s sync execute timeout undocumented.** The control plane has a hard
   90-second timeout on POST /api/v1/execute/<target>. Slow models (minimax-m2.7,
   Claude Sonnet, o1) and large fan-outs blow it. Generated systems
   would hit HTTP 400 {"error":"execution timeout after 1m30s"} with no
   guidance. Document the limit + the async fallback path
   (POST /api/v1/execute/async) in verification.md, plus point at
   gemini-2.5-flash as the recommended fast default.

4. **Discovery API curl shape was wrong everywhere.** The skill teaches
   `.reasoners[] | select(.node_id=="X") | .name` but the actual response
   is `.capabilities[].reasoners[]` with `agent_id` (not `node_id`) and
   `id` (not `name`). Same for /api/v1/nodes — its default
   ?health_status=active filter hides healthy nodes that haven't reported
   "active" yet, so use ?health_status=any. Fix in SKILL.md and verification.md.

5. **Python init template violated the skill's own hard rules.** The
   scaffold from `af init` was using app.serve(auto_port=True) and
   hardcoding agentfield_server, which the skill explicitly rejects. Codex
   had to fully rewrite main.py on every build. Update the template to use
   app.run(auto_port=False), env-driven AGENT_NODE_ID/AGENTFIELD_SERVER/
   AI_MODEL/PORT, and a real AIConfig. The scaffold is now consistent with
   the skill's mandatory patterns out of the box.
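The corrected discovery shape from bug 4 can be sketched with plain stdlib JSON parsing. The payload below is a hypothetical illustration of the response structure, not a captured response; the key point is that reasoners sit under `.capabilities[].reasoners[]` and carry `agent_id` and `id`:

```python
import json

# Hypothetical discovery payload illustrating the corrected shape of
# GET /api/v1/discovery/capabilities: reasoners live under
# .capabilities[].reasoners[] and use `agent_id` / `id`,
# not the `node_id` / `name` keys the skill previously taught.
sample = json.loads("""
{
  "capabilities": [
    {
      "reasoners": [
        {"agent_id": "triage", "id": "classify_symptoms"},
        {"agent_id": "triage", "id": "assess_risk"}
      ]
    }
  ]
}
""")

# jq equivalent: .capabilities[].reasoners[] | select(.agent_id=="triage") | .id
ids = [
    r["id"]
    for cap in sample["capabilities"]
    for r in cap["reasoners"]
    if r["agent_id"] == "triage"
]
print(ids)
```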
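The env-driven pattern from bug 5 looks roughly like the following. This is a sketch of only the configuration-reading half; the default values are illustrative assumptions, and the SDK wiring (building the AIConfig, calling app.run(auto_port=False)) is intentionally omitted:

```python
import os

# Sketch of the env-driven configuration the fixed template uses.
# Defaults shown here are illustrative assumptions, not the shipped values.
node_id = os.environ.get("AGENT_NODE_ID", "my-agent")
server = os.environ.get("AGENTFIELD_SERVER", "http://localhost:8080")
model = os.environ.get("AI_MODEL", "openrouter/google/gemini-2.5-flash")
port = int(os.environ.get("PORT", "8001"))

# The real template then builds an AIConfig from these values and starts
# the agent with app.run(auto_port=False); that wiring is omitted here.
print(node_id, server, model, port)
```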

## New philosophy: reasoners as software APIs

Codex's first build (and the loan-underwriter before it) produced a "fat
orchestrator + flat specialists" star pattern: depth-2 DAG, single-layer
parallelism, every specialist has a 50-line .ai() prompt, no reuse across
branches. That's basically asyncio.gather([llm_call_1, llm_call_2, ...])
with extra ceremony.

The right shape is **deep composition cascade**: each reasoner has a
single cognitive responsibility, the orchestrator pushes calls DOWN into
sub-reasoners, parallelism happens at multiple depths, common sub-reasoners
get reused across branches. Each reasoner has a one-line API contract you
could write down — they are software APIs.
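The contrast between the two shapes can be sketched in a few lines of asyncio. The reasoner names here are hypothetical and the stub coroutine stands in for a real `.ai()` call; what matters is the structure — parallelism at two depths and the shared `extract_vitals` sub-reasoner reused across branches:

```python
import asyncio

# Stub standing in for a single-responsibility reasoner; a real reasoner
# would make one focused .ai() call here.
async def reason(name: str, *inputs: str) -> str:
    await asyncio.sleep(0)  # placeholder for model latency
    return f"{name}({','.join(inputs)})"

# Composition cascade: calls pushed DOWN into sub-reasoners, parallelism
# at multiple depths, a common sub-reasoner reused across branches.
async def cardiac_branch(case: str) -> str:
    vitals = await reason("extract_vitals", case)
    risk, ecg = await asyncio.gather(
        reason("cardiac_risk", vitals), reason("ecg_read", vitals)
    )
    return await reason("cardiac_summary", risk, ecg)

async def pulmonary_branch(case: str) -> str:
    vitals = await reason("extract_vitals", case)  # reused sub-reasoner
    return await reason("pe_steelman", vitals)

async def cascade(case: str) -> str:
    left, right = await asyncio.gather(
        cardiac_branch(case), pulmonary_branch(case)
    )
    return await reason("triage_decision", left, right)

result = asyncio.run(cascade("case"))
print(result)
```

Collapsing this into a single `asyncio.gather` of flat specialists under one orchestrator is exactly the depth-2 star pattern the section above rejects.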

Added to the skill:
- New mandatory section "The unit of intelligence is the reasoner — treat
  them as software APIs" in SKILL.md, with bad/good shape ASCII diagrams,
  concrete decomposition rules (30-line ceiling, single-judgment rule,
  reuse-signal extraction), and depth ≥ 3 minimum
- New "Reasoner Composition Cascade" pattern (#8) in architecture-patterns.md
  marked as the master pattern that every other pattern layers onto
- Updated "How to pick a pattern" picker to start from cascade as the
  backbone instead of treating it as one option among many
- HARD GATE updated: "If you cannot draw your system as a non-trivial
  graph with depth ≥ 3, you have not architected anything"
- Grooming rule conflict resolved: the skip-the-question rule now lives
  inside the HARD GATE block so agents see them together, not as
  competing instructions in separate sections

## Tested end-to-end

Live test of the v1 medical-triage build:
- docker compose up --build → both containers up
- 9 reasoners discovered through /api/v1/discovery/capabilities
- Real curl with the Maria Hernandez patient case →
  CALL_911_NOW with full provenance, 17 second wall clock,
  HTTP 200, 16KB structured response
- The adversarial reviewer correctly steel-manned Pulmonary Embolism
  (because the chest pain is pleuritic) on top of the AMI primary concern
- Deterministic governance overrides fired correctly when committee
  confidence dipped — the safe-default fallback pattern works in production

The build only succeeded after the manual healthcheck patch + the model
swap to gemini-2.5-flash. Both fixes are now baked into the templates so
the next codex run will produce a working build on first try.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>