feat: grammar-constrained AI review output (Ollama JSON schema)#30

Open
avrabe wants to merge 1 commit into main from feat/ai-review-json-schema
Conversation

@avrabe
Contributor

@avrabe avrabe commented Apr 27, 2026

Schema mode for the AI review: see the commit message and analysis above. Adds `REVIEW_JSON_SCHEMA` and `buildResponseFormat()`, and wires `response_format` through `callLocalAI` in strict mode. All 773 tests pass. Risk is low: older Ollama versions silently ignore the field. After deploy, `/review-pr` should produce correctly shaped JSON instead of falling back to silence on wrong-shape outputs.

🤖 Generated with Claude Code

## Why
Today's live test revealed the small model's failure mode: it can produce
valid JSON of the *wrong shape*. PR #222 got `{"commit_message":"Add
pre-commit configuration files"}` instead of `{verdict, summary,
findings}`. The strict parser correctly rejected it, and the bot fell back to silence.

The strict prompt teaches the model the *intent*. The schema enforces the
*shape*. Belt-and-suspenders.

## What
New exports in `src/ai-review-prompt.js`:

- `REVIEW_JSON_SCHEMA` — full JSON Schema for the review payload, with
  `verdict` enum, required fields, and `additionalProperties: false`.
- `buildResponseFormat()` — wraps the schema in OpenAI-compat
  `response_format: { type: 'json_schema', json_schema: { name, strict,
  schema } }`. Ollama (≥0.5) honours this on `/v1/chat/completions` for
  grammar-constrained generation — the decoder physically cannot emit
  tokens that violate the schema.
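A minimal sketch of what these exports might look like. The `verdict` enum values and the shape of `findings` are assumptions (the PR only names the top-level fields); only `verdict`, `summary`, `findings`, the enum, `required`, and `additionalProperties: false` come from the description above:

```javascript
// Hypothetical sketch of src/ai-review-prompt.js exports.
// Enum values and the findings item shape are illustrative assumptions.
const REVIEW_JSON_SCHEMA = {
  type: 'object',
  properties: {
    verdict: { type: 'string', enum: ['approve', 'request_changes', 'comment'] },
    summary: { type: 'string' },
    findings: { type: 'array', items: { type: 'string' } },
  },
  required: ['verdict', 'summary', 'findings'],
  additionalProperties: false,
};

function buildResponseFormat() {
  // OpenAI-compatible wrapper; Ollama >= 0.5 honours it on /v1/chat/completions
  return {
    type: 'json_schema',
    json_schema: {
      name: 'review',
      strict: true,
      schema: REVIEW_JSON_SCHEMA,
    },
  };
}
```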

`src/ai-review.js`:
- `callLocalAI` accepts a new `responseFormat` option. When set, it goes
  on the wire as `response_format`. Older Ollama versions ignore unknown
  fields, so the bot degrades gracefully to strict-prompt + parser path.
- Strict mode passes `buildResponseFormat()` automatically. Legacy
  freeform path (when `system_prompt` is overridden) does not — that path
  expects prose.
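The pass-through described above could be sketched as follows. The payload builder is a hypothetical helper (the PR does not name one), and everything except the `responseFormat` option and the `response_format` wire field is an assumption:

```javascript
// Hypothetical sketch of the responseFormat pass-through in src/ai-review.js.
// Only responseFormat -> response_format is from the PR; the rest is illustrative.
function buildChatPayload({ model, messages, responseFormat }) {
  const payload = { model, messages };
  if (responseFormat) {
    // Goes on the wire as response_format. Older Ollama versions ignore
    // unknown fields, so the strict-prompt + parser path is unaffected.
    payload.response_format = responseFormat;
  }
  return payload;
}

async function callLocalAI({ baseUrl, model, messages, responseFormat }) {
  const res = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildChatPayload({ model, messages, responseFormat })),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

When `responseFormat` is omitted, the payload is byte-for-byte what it was before, which is what keeps the change opt-in.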

## Why this is a bigger deal than it looks
The Netcup VPS is RAM-bound: 3.8 GB total, no swap. Anything bigger than
qwen2.5-coder:3b at Q4 risks OOM on inference. So we cannot just upgrade
the model. Constrained decoding lets the existing 3 B model succeed on
the shape problem the prompt couldn't fix on its own.

## Test plan
- [x] All 773 tests pass (was 766 — added 7 covering schema shape, response
      format builder, wire-payload pass-through, legacy-mode omission)
- [x] eslint clean
- [ ] After deploy: trigger `/review-pr` on a recent PR. Wire-level: the
      Ollama request body should contain `response_format` field. Behaviour:
      no more `"AI review JSON parse failed: invalid verdict: undefined"`
      log lines (the previous failure was wrong-shape JSON; this kills it).

## Risk & rollout
- Risk: low. The new field is opt-in via the `responseFormat` argument; when
  absent, the wire payload is unchanged from current behaviour. Old Ollama
  silently ignores it and falls through to the existing parser.
- Rollout: self-update on merge.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>