
Fix structured outputs #20223

Merged

pwilkin merged 2 commits into ggml-org:master from pwilkin:structured-output on Mar 8, 2026
Conversation

pwilkin (Member) commented on Mar 8, 2026

Fixes support for structured outputs in autoparser.

pwilkin requested a review from aldehir on March 8, 2026 at 01:26
pwilkin (Member, Author) commented on Mar 8, 2026

Fixes #20221

tarruda commented on Mar 8, 2026

@pwilkin thanks for this.

I'm not familiar with the testing infrastructure, but could we add the example request as a regression test for llama-server structured output?

pwilkin (Member, Author) commented on Mar 8, 2026

I can try to add a server test for it and see how the tiny-stories model handles structured output ;)
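For reference, such a regression request could be sketched roughly as follows. This is a hedged Python sketch that only constructs the request body; the model name, the example schema, and the exact `response_format` shape accepted by llama-server are assumptions here (the field names follow the OpenAI-style convention and may differ by server version):

```python
import json

# Hypothetical example schema for the regression test.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

# OpenAI-style chat completion payload with a json_schema response format.
# Sent to the server's OpenAI-compatible endpoint in a real test.
payload = {
    "model": "tiny-stories",  # hypothetical model name for the test harness
    "messages": [{"role": "user", "content": "Give me a person as JSON."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "person", "schema": schema},
    },
}

body = json.dumps(payload)
```

A server test would POST this body and assert that the response content parses as JSON conforming to the schema.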

Comment thread on common/chat-auto-parser-generator.cpp (outdated):

        if (has_response_format) {
    -       return ctx.reasoning_parser + p.space() +
    -              p.content(p.schema(p.json(), "response-format", inputs.json_schema)) + p.end();
    +       return ctx.reasoning_parser + p.space() + p.optional(p.literal("```json") + p.space()) +

This comment was marked as outdated.

Comment thread on common/chat-auto-parser-generator.cpp (outdated):

Co-authored-by: Aldehir Rojas <hello@alde.dev>

    -       p.content(p.schema(p.json(), "response-format", inputs.json_schema)) + p.end();
    +       auto response_format = p.rule("response-format", p.content(p.schema(p.json(), "response-format-schema", inputs.json_schema)));
    +       return ctx.reasoning_parser + p.space() + p.choice({
    +           p.literal("```json") + p.space() + response_format + p.space() + p.literal("```"),
I might not be interpreting this correctly, but should the model be able to output fenced Markdown blocks in a structured response? The main guarantee of structured JSON output is that it contains only valid JSON, with no need to extract it from Markdown blocks.

aldehir (Contributor) commented on Mar 8, 2026
The parser will extract the JSON within the code fences, so you won't see them in the response (see the definition wrapped in p.content()).

This is good for models whose training sets contain a lot of JSON examples in code fences, such as Gemma 3.
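To illustrate the idea, here is a hedged Python sketch (not the C++ parser itself) of what that extraction amounts to: the same JSON is accepted either bare or wrapped in a ```json fence, and only the inner JSON reaches the caller. The helper name `extract_structured_json` is hypothetical, chosen for illustration:

```python
import json
import re

def extract_structured_json(text: str):
    """Accept the JSON payload bare or inside a ```json code fence,
    and surface only the inner JSON to the caller (the role played by
    p.content() in the generated grammar)."""
    fenced = re.fullmatch(r"```json\s*(.*?)\s*```", text.strip(), re.DOTALL)
    inner = fenced.group(1) if fenced else text.strip()
    return json.loads(inner)

# Both forms yield the same parsed object:
bare = extract_structured_json('{"a": 1}')
fenced = extract_structured_json('```json\n{"a": 1}\n```')
```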


I've always assumed that llama.cpp manipulated the model's token predictions so that only tokens keeping the output valid according to the JSON schema could be sampled, and that emitting fence tokens would not even be possible.

Contributor:

Grammar-constrained decoding works for any context-free language. JSON is one, but so is JSON wrapped in fences. We need to support more than just JSON, especially for reasoning models that need to reason before generating a structured response.
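As a hedged sketch of that point, the language being constrained to can be written down as an alternation: bare JSON, or JSON inside a code fence. The Python recognizer below (`accepted_by_grammar` is a hypothetical name) only illustrates the shape of that language; real grammar-constrained decoding masks invalid tokens during sampling rather than checking the output after the fact:

```python
import json

def accepted_by_grammar(text: str) -> bool:
    """Return True if text matches the alternation used by the grammar:
    bare JSON, or JSON wrapped in a ```json code fence. A post-hoc
    illustration of the context-free language, not the sampler itself."""
    s = text.strip()
    candidates = [s]
    if s.startswith("```json") and s.endswith("```"):
        # Strip the fence and try the inner content as well.
        candidates.append(s[len("```json"):-3].strip())
    for candidate in candidates:
        try:
            json.loads(candidate)
            return True
        except json.JSONDecodeError:
            pass
    return False
```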

@pwilkin pwilkin merged commit 62b8143 into ggml-org:master Mar 8, 2026
77 of 78 checks passed
bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 10, 2026
* Fix structured outputs

* Update common/chat-auto-parser-generator.cpp

Co-authored-by: Aldehir Rojas <hello@alde.dev>

---------

Co-authored-by: Aldehir Rojas <hello@alde.dev>
Ethan-a2 pushed a commit to Ethan-a2/llama.cpp that referenced this pull request Mar 20, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026

Labels: None yet
Projects: None yet
Development: Successfully merging this pull request may close these issues.

3 participants