Structured output mode — constrain agent responses to a declared JSON schema #28963

@sneric

Problem

Agentic workflows currently produce freeform text as their agent output. Downstream jobs that consume this output must parse unstructured natural language, which is fragile, non-deterministic, and breaks when the LLM changes its phrasing. There is no built-in mechanism to declare that an agent's response must conform to a specific JSON schema, even though the underlying model providers (OpenAI, Anthropic, Gemini) all support constrained or structured output natively.

This creates a gap: workflows that need machine-readable output from the agent job — for routing, aggregation, database writes, or chaining into subsequent agent jobs — must resort to brittle regex parsing or prompt-engineering hacks ("respond only in JSON"), with no schema validation and no compile-time safety.

Why this matters for agentic workflows

The safe-outputs feature already demonstrates that gh-aw understands the need for structured side-effects (comments, PRs, labels). Structured output mode extends this principle to the agent's primary response — the content the agent produces, not just the actions it takes.

Use Cases

  1. Multi-agent pipelines: Agent A produces a structured analysis → Agent B consumes it as typed input. Today this requires an intermediate parsing job and fails silently on schema drift.

  2. Data extraction workflows: An agent reads an issue, extracts fields (severity, component, repro steps) into a JSON object, and a downstream job writes them to a project board or database. Without schema enforcement, missing fields cause silent data loss.

  3. Decision routing: An agent evaluates a PR and outputs { "decision": "APPROVE" | "REQUEST_CHANGES" | "ESCALATE", "reasoning": "..." }. The downstream job routes based on decision. Freeform text makes this routing unreliable.

  4. Aggregation across matrix jobs: Combined with #26598 (matrix strategy), parallel agent instances each produce a typed result object. A fan-in job aggregates them. Without schema enforcement, one malformed response breaks the entire aggregation.

  5. Audit and compliance: Regulated industries need deterministic, schema-validated output for audit trails — not "the LLM usually returns JSON."

Proposed Solution

Add a structured-output configuration in the workflow frontmatter that declares a JSON schema the agent's final response must conform to.

Frontmatter syntax

---
on:
  issues:
    types: [labeled]

engine:
  id: copilot

structured-output:
  schema:
    type: object
    properties:
      decision:
        type: string
        enum: [APPROVE, REQUEST_CHANGES, ESCALATE]
      reasoning:
        type: string
        minLength: 10
      confidence:
        type: number
        minimum: 0
        maximum: 1
    required: [decision, reasoning, confidence]
    additionalProperties: false

tools:
  github:
    toolsets: [default]
---
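
To make the intent of the schema above concrete, here is a minimal sketch of the check it implies, written in plain Python. It is a hand-rolled validator covering only the keywords used in this example (type, enum, minLength, minimum, maximum, required, additionalProperties), not a full JSON Schema implementation and not how gh-aw would necessarily do it. The names `SCHEMA` and `validate` are hypothetical.

```python
# Mirror of the frontmatter schema above, as a Python dict.
SCHEMA = {
    "type": "object",
    "properties": {
        "decision": {"type": "string", "enum": ["APPROVE", "REQUEST_CHANGES", "ESCALATE"]},
        "reasoning": {"type": "string", "minLength": 10},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["decision", "reasoning", "confidence"],
    "additionalProperties": False,
}


def validate(instance, schema, path="$"):
    """Return a list of violation messages; an empty list means the instance conforms."""
    errors = []
    expected = schema.get("type")
    if expected == "object":
        if not isinstance(instance, dict):
            return [f"{path}: expected object"]
        props = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property '{key}'")
        if schema.get("additionalProperties") is False:
            for key in instance:
                if key not in props:
                    errors.append(f"{path}: unexpected property '{key}'")
        for key, sub in props.items():
            if key in instance:
                errors.extend(validate(instance[key], sub, f"{path}.{key}"))
    elif expected == "string":
        if not isinstance(instance, str):
            return [f"{path}: expected string"]
        if len(instance) < schema.get("minLength", 0):
            errors.append(f"{path}: shorter than minLength")
    elif expected == "number":
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(instance, bool) or not isinstance(instance, (int, float)):
            return [f"{path}: expected number"]
        if "minimum" in schema and instance < schema["minimum"]:
            errors.append(f"{path}: below minimum")
        if "maximum" in schema and instance > schema["maximum"]:
            errors.append(f"{path}: above maximum")
    if "enum" in schema and instance not in schema["enum"]:
        errors.append(f"{path}: not one of {schema['enum']}")
    return errors
```

Returning a list of messages rather than raising on the first problem keeps every violation available for an error report or a correction prompt.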

Alternative: schema file reference

For larger schemas, reference an external file:

structured-output:
  schema-file: .github/schemas/triage-output.schema.json
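
One way the two forms could be resolved into a single schema object is sketched below. The function name `resolve_schema`, the rule that exactly one of `schema` or `schema-file` may be declared, and the check that the file stays inside the repository are all assumptions made for illustration; none of them is specified by this proposal.

```python
import json
from pathlib import Path


def resolve_schema(structured_output: dict, repo_root: Path) -> dict:
    """Return the schema dict from a parsed structured-output section.

    Assumption: exactly one of 'schema' (inline) or 'schema-file' is allowed.
    """
    has_inline = "schema" in structured_output
    has_file = "schema-file" in structured_output
    if has_inline == has_file:
        raise ValueError("declare exactly one of 'schema' or 'schema-file'")
    if has_inline:
        return structured_output["schema"]

    root = repo_root.resolve()
    path = (root / structured_output["schema-file"]).resolve()
    # Assumption: reject paths that escape the repository (e.g. via '..').
    if not path.is_relative_to(root):
        raise ValueError("schema-file must stay inside the repository")
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `resolve_schema({"schema-file": ".github/schemas/triage-output.schema.json"}, Path("."))` would load the file referenced above, provided it exists in the working tree.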

Runtime behavior

  1. Compile-time: gh aw compile validates that the schema is well-formed JSON Schema (draft-07 or 2020-12).
  2. Agent invocation: The runtime passes the schema to the underlying LLM using its native structured output mechanism:
    • OpenAI: response_format: { type: "json_schema", json_schema: { ... } }
    • Anthropic: Tool use with a single tool matching the schema
    • Other engines: Prompt-based enforcement with post-hoc validation as fallback
  3. Validation: After the agent responds, the runtime validates the response against the declared schema before making it available to downstream jobs.
  4. Failure mode: If validation fails, the runtime retries once with an error-correction prompt. If validation fails again, the workflow fails with a clear schema-violation error (not a silent pass-through of malformed data).
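
Steps 3 and 4 can be sketched as a small validate-and-retry loop. This is an illustration only: `ask_agent` stands in for whatever call invokes the engine, `validate` for any schema check that returns a list of error messages, and `SchemaViolation`, `run_with_validation`, and the wording of the correction prompt are hypothetical names and choices rather than part of gh-aw.

```python
import json
from typing import Callable


class SchemaViolation(Exception):
    """Raised when the response still violates the schema after all retries."""


def run_with_validation(
    ask_agent: Callable[[str], str],
    validate: Callable[[object], list],
    prompt: str,
    max_retries: int = 1,
) -> object:
    """Ask the agent, validate the reply, and retry with the errors attached."""
    errors: list = []
    for attempt in range(max_retries + 1):
        if attempt == 0:
            text = ask_agent(prompt)
        else:
            # Error-correction prompt: original task plus the concrete violations.
            correction = (
                prompt
                + "\n\nYour previous response was rejected:\n"
                + "\n".join(errors)
                + "\nReply again with JSON that fixes these problems."
            )
            text = ask_agent(correction)
        try:
            data = json.loads(text)
        except json.JSONDecodeError as exc:
            errors = [f"response is not valid JSON: {exc.msg}"]
            continue
        errors = validate(data)
        if not errors:
            return data
    # Fail loudly instead of passing malformed data downstream.
    raise SchemaViolation("; ".join(errors))
```

The default `max_retries=1` matches the single retry described in step 4; making it a parameter leaves room for the configurable policy raised in the scope questions.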

Output access in downstream jobs

The validated JSON is available as a typed output:

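jobs:
  route:
    runs-on: ubuntu-latest
    needs: agent
    steps:
      - env:
          # Proposed output syntax; passed via env so it is never interpolated into the script
          DECISION: ${{ needs.agent.outputs.structured.decision }}
          # Assumes the issues trigger from the frontmatter example
          ISSUE: ${{ github.event.issue.number }}
          GH_TOKEN: ${{ github.token }}
          GH_REPO: ${{ github.repository }}
        run: |
          if [ "$DECISION" = "ESCALATE" ]; then
            gh issue edit "$ISSUE" --add-label "needs-human"
          fi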
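
GitHub Actions job outputs are strings, so the validated object may reach a downstream job as serialized JSON rather than as a nested value. The transport is not specified in this proposal, so the sketch below assumes a serialized string and shows the routing decision in Python. `labels_for` is a hypothetical helper; only the ESCALATE to needs-human mapping comes from the example above.

```python
import json

ALLOWED_DECISIONS = {"APPROVE", "REQUEST_CHANGES", "ESCALATE"}


def labels_for(raw_output: str) -> list[str]:
    """Map the agent's serialized structured output to labels to add."""
    result = json.loads(raw_output)
    decision = result["decision"]
    # Defensive re-check; with schema enforcement upstream this should never fire.
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    return ["needs-human"] if decision == "ESCALATE" else []
```

Because the schema restricts decision to a fixed set of values, the routing logic reduces to an exact comparison instead of pattern matching on prose.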

Prior Art

  • safe-outputs already constrains agent side-effects (add-comment, create-issue, etc.) with typed schemas. Structured output mode is the read-side complement — constraining the agent's response content, not just its actions.
  • #28863 (Large content not passed to output job) highlights existing friction in the output pipeline. Structured output would benefit from a robust output transport regardless of size.
  • #26598 (Matrix strategy) — structured output pairs naturally with matrix fan-out/fan-in patterns.
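  • OpenAI Structured Outputs (GA since Aug 2024): response_format: { type: "json_schema" }. With strict: true, decoding is constrained to the supplied schema, which must stay within the subset of JSON Schema that strict mode supports.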
  • Anthropic tool_use: Constrained output via single-tool invocation with JSON schema.
  • LangChain/LangGraph: with_structured_output() is the standard pattern for schema-constrained LLM responses.
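
For reference, the two native mechanisms listed above can be sketched as request fragments built from the same schema. The field layout follows the public OpenAI Chat Completions and Anthropic Messages APIs as I understand them and should be checked against current provider documentation. The helper names and the tool description text are hypothetical.

```python
def openai_response_format(name: str, schema: dict) -> dict:
    """Build the response_format value for an OpenAI structured-output request."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": True},
    }


def anthropic_tool_request(name: str, schema: dict) -> dict:
    """Build the tools/tool_choice fields that force a single schema-shaped tool call."""
    return {
        "tools": [
            {
                "name": name,
                "description": "Return the final answer in the required structure.",
                "input_schema": schema,
            }
        ],
        # Forcing this tool means the reply carries its arguments as a JSON object.
        "tool_choice": {"type": "tool", "name": name},
    }
```

Either fragment would be merged into the rest of the request (model, messages, and so on). Post-hoc validation remains useful in both cases, since keywords such as minLength or minimum may not be enforced by every engine.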

Alternatives Considered

| Approach | Limitation |
| --- | --- |
| Prompt-engineer "respond in JSON" | No enforcement; LLM can drift, wrap in markdown, or omit fields |
| Parse output with jq in downstream job | Fails silently on schema drift; no retry; no compile-time safety |
| Custom job that validates + retries | Adds a full extra job layer per workflow; not first-class |
| Use safe-outputs as a proxy | Safe outputs are for side-effects, not the agent's primary response; shoehorning data into a comment or file is a workaround |
| External validation MCP server | Requires extra infra; moves schema definition away from the workflow |

Scope Questions for Maintainers

  1. Schema format: JSON Schema draft-07 (widely supported) or 2020-12 (latest)?
  2. Engine coverage: Start with Copilot/OpenAI only (native structured output), or include Claude/Codex with fallback validation from day one?
  3. Retry policy: Single retry with error-correction prompt, or configurable (max-retries: 3)?
  4. Schema composition: Should structured-output support $ref to shared schema definitions, or keep it self-contained per workflow?
  5. Interaction with safe-outputs: Can a workflow declare both structured output (for the response body) and safe outputs (for side-effects) simultaneously?

Additional Context

This feature would make gh-aw competitive with framework-level structured output support (LangChain, CrewAI, AutoGen) while staying native to the GitHub Actions paradigm. The safe-outputs design already demonstrates that gh-aw can enforce output contracts — structured output extends that contract to the agent's primary response.
