## Problem
Agentic workflows currently produce freeform text in their agent output. Downstream jobs that consume this output must parse unstructured natural language, which is fragile, non-deterministic, and breaks when the LLM changes phrasing. There is no built-in mechanism to declare that an agent's response must conform to a specific JSON schema, even though the underlying models (OpenAI, Anthropic, Gemini) all support constrained/structured output natively.
This creates a gap: workflows that need machine-readable output from the agent job — for routing, aggregation, database writes, or chaining into subsequent agent jobs — must resort to brittle regex parsing or prompt-engineering hacks ("respond only in JSON"), with no schema validation and no compile-time safety.
### Why this matters for agentic workflows
The `safe-outputs` feature already demonstrates that gh-aw understands the need for structured side-effects (comments, PRs, labels). Structured output mode extends this principle to the agent's primary response — the content the agent produces, not just the actions it takes.
## Use Cases
- **Multi-agent pipelines:** Agent A produces a structured analysis → Agent B consumes it as typed input. Today this requires an intermediate parsing job and fails silently on schema drift.
- **Data extraction workflows:** An agent reads an issue, extracts fields (severity, component, repro steps) into a JSON object, and a downstream job writes them to a project board or database. Without schema enforcement, missing fields cause silent data loss.
- **Decision routing:** An agent evaluates a PR and outputs `{ "decision": "APPROVE" | "REQUEST_CHANGES" | "ESCALATE", "reasoning": "..." }`. The downstream job routes based on `decision`. Freeform text makes this routing unreliable.
- **Aggregation across matrix jobs:** Combined with #26598 (matrix strategy), parallel agent instances each produce a typed result object. A fan-in job aggregates them. Without schema enforcement, one malformed response breaks the entire aggregation.
- **Audit and compliance:** Regulated industries need deterministic, schema-validated output for audit trails — not "the LLM usually returns JSON."
## Proposed Solution
Add a `structured-output` configuration in the workflow frontmatter that declares a JSON schema the agent's final response must conform to.
### Frontmatter syntax
```yaml
---
on:
  issues:
    types: [labeled]
engine:
  id: copilot
structured-output:
  schema:
    type: object
    properties:
      decision:
        type: string
        enum: [APPROVE, REQUEST_CHANGES, ESCALATE]
      reasoning:
        type: string
        minLength: 10
      confidence:
        type: number
        minimum: 0
        maximum: 1
    required: [decision, reasoning, confidence]
    additionalProperties: false
tools:
  github:
    toolsets: [default]
---
```
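For illustration, an agent response conforming to this schema would look like the following (the field values are invented for the example):

```json
{
  "decision": "ESCALATE",
  "reasoning": "The PR modifies auth middleware without accompanying tests.",
  "confidence": 0.72
}
```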
### Alternative: schema file reference
For larger schemas, reference an external file:
```yaml
structured-output:
  schema-file: .github/schemas/triage-output.schema.json
```
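For illustration, the referenced file would hold the same schema in plain JSON Schema form; assuming draft-07, the triage example above would become:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "decision": { "type": "string", "enum": ["APPROVE", "REQUEST_CHANGES", "ESCALATE"] },
    "reasoning": { "type": "string", "minLength": 10 },
    "confidence": { "type": "number", "minimum": 0, "maximum": 1 }
  },
  "required": ["decision", "reasoning", "confidence"],
  "additionalProperties": false
}
```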
### Runtime behavior
- **Compile-time:** `gh aw compile` validates that the schema is well-formed JSON Schema (draft-07 or 2020-12).
- **Agent invocation:** The runtime passes the schema to the underlying LLM using its native structured output mechanism:
  - OpenAI: `response_format: { type: "json_schema", json_schema: { ... } }`
  - Anthropic: tool use with a single tool matching the schema
  - Other engines: prompt-based enforcement with post-hoc validation as fallback
- **Validation:** After the agent responds, the runtime validates the response against the declared schema before making it available to downstream jobs.
- **Failure mode:** If validation fails, the runtime retries once with an error-correction prompt. If validation fails again, the workflow fails with a clear schema-violation error (not a silent pass-through of malformed data).
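The validate-then-retry loop can be sketched as follows. This is a minimal illustration, not the gh-aw implementation: `call_agent` is a hypothetical engine invocation, and the hand-rolled checks stand in for a full JSON Schema validator.

```python
import json

# Fields and constraints from the triage schema above, hard-coded for the sketch.
DECISION_ENUM = {"APPROVE", "REQUEST_CHANGES", "ESCALATE"}

def validate(raw: str) -> tuple[bool, str]:
    """Return (ok, error message) for one agent response string."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return False, f"not valid JSON: {e}"
    if not isinstance(data, dict) or set(data) != {"decision", "reasoning", "confidence"}:
        return False, "wrong set of keys (additionalProperties: false)"
    if data["decision"] not in DECISION_ENUM:
        return False, "decision not in enum"
    if not isinstance(data["reasoning"], str) or len(data["reasoning"]) < 10:
        return False, "reasoning shorter than minLength"
    if not isinstance(data["confidence"], (int, float)) or not 0 <= data["confidence"] <= 1:
        return False, "confidence outside [0, 1]"
    return True, ""

def run_with_retry(call_agent, prompt: str) -> dict:
    """Invoke the agent, retry once with an error-correction prompt, then fail hard."""
    raw = call_agent(prompt)
    ok, err = validate(raw)
    if not ok:
        raw = call_agent(
            f"{prompt}\n\nYour previous reply was invalid ({err}). "
            "Respond with JSON matching the schema exactly."
        )
        ok, err = validate(raw)
        if not ok:
            raise ValueError(f"schema violation after retry: {err}")
    return json.loads(raw)
```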
### Output access in downstream jobs
The validated JSON is available as a typed output:
```yaml
jobs:
  route:
    runs-on: ubuntu-latest
    needs: agent
    steps:
      - run: |
          DECISION='${{ needs.agent.outputs.structured.decision }}'
          if [ "$DECISION" = "ESCALATE" ]; then
            gh issue edit $ISSUE --add-label "needs-human"
          fi
```
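How the compiler maps the validated object onto job outputs is an open design detail: Actions output names cannot contain dots, so the dotted access above would need to be rewritten, e.g. to per-key outputs joined with underscores or to a single JSON output read with `fromJSON`. A sketch of the per-key flattening, under that assumption:

```python
import json

def flatten_outputs(obj: dict, prefix: str = "structured") -> dict:
    """Flatten a validated JSON object into per-key Actions output names.
    Nested keys are joined with underscores since dots are not legal in
    output names; the runtime would append these lines to $GITHUB_OUTPUT."""
    out = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}"
        if isinstance(value, dict):
            out.update(flatten_outputs(value, name))
        else:
            # Strings pass through; other scalars are serialized as JSON.
            out[name] = value if isinstance(value, str) else json.dumps(value)
    return out
```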
## Prior Art
- `safe-outputs` already constrains agent side-effects (`add-comment`, `create-issue`, etc.) with typed schemas. Structured output mode is the read-side complement — constraining the agent's response content, not just its actions.
- #28863 (Large content not passed to output job) highlights existing friction in the output pipeline. Structured output would benefit from a robust output transport regardless of size.
- #26598 (Matrix strategy) — structured output pairs naturally with matrix fan-out/fan-in patterns.
- OpenAI Structured Outputs (GA since Aug 2024): `response_format: { type: "json_schema" }` with 100% schema adherence guarantee.
- Anthropic `tool_use`: constrained output via single-tool invocation with JSON schema.
- LangChain/LangGraph: `with_structured_output()` is the standard pattern for schema-constrained LLM responses.
## Alternatives Considered
| Approach | Limitation |
| --- | --- |
| Prompt-engineer "respond in JSON" | No enforcement; LLM can drift, wrap in markdown, or omit fields |
| Parse output with `jq` in downstream job | Fails silently on schema drift; no retry; no compile-time safety |
| Custom job that validates + retries | Adds a full extra job layer per workflow; not first-class |
| Use `safe-outputs` as a proxy | Safe outputs are for side-effects, not the agent's primary response; shoehorning data into a comment or file is a workaround |
| External validation MCP server | Requires extra infra; moves schema definition away from the workflow |
## Scope Questions for Maintainers
- **Schema format:** JSON Schema draft-07 (widely supported) or 2020-12 (latest)?
- **Engine coverage:** Start with Copilot/OpenAI only (native structured output), or include Claude/Codex with fallback validation from day one?
- **Retry policy:** Single retry with error-correction prompt, or configurable (`max-retries: 3`)?
- **Schema composition:** Should `structured-output` support `$ref` to shared schema definitions, or keep it self-contained per workflow?
- **Interaction with `safe-outputs`:** Can a workflow declare both structured output (for the response body) and safe outputs (for side-effects) simultaneously?
## Additional Context
This feature would make gh-aw competitive with framework-level structured output support (LangChain, CrewAI, AutoGen) while staying native to the GitHub Actions paradigm. The safe-outputs design already demonstrates that gh-aw can enforce output contracts — structured output extends that contract to the agent's primary response.