feat: Go eval CLI with CGo FFI bridge by urmzd · Pull Request #9 · urmzd/generative-artifact-protocol

urmzd · 2026-04-14T23:50:32Z

Summary

Adds a Go eval CLI (apps/eval-cli) that benchmarks base vs GAP flows across multiple providers (Google, OpenAI, Ollama, Groq, GitHub Models)
Introduces C FFI bindings (src/cffi.rs) so the Go CLI can call the Rust GAP apply engine via CGo
Fixes Ollama provider ignoring --host flag when OLLAMA_API_KEY is set (was silently redirecting to ollama.com)
Fixes silent error swallowing from saige SDK ErrorDelta — stream errors are now extracted and surfaced in all runner paths
Adds justfile recipes: build-go, test-go, run-go, report-go
Includes experiment 026 results (gemma4 via Ollama): base flow succeeds, GAP envelope apply fails (model capability limitation)

Test plan

go vet ./... passes
go test ./... passes
Smoke tested direct OpenAI adapter vs provider.Build — both produce output
End-to-end run of experiment 026 with Ollama (gemma4) — base flow produces valid artifacts, GAP flow correctly reports parse/apply failures
Test with a cloud provider (Google/OpenAI) for GAP apply success validation

Adds a Go-based eval CLI (apps/eval-cli) that benchmarks base vs GAP flows across providers (Google, OpenAI, Ollama, Groq, GitHub). Uses CGo to call the Rust GAP apply engine via a new C FFI layer (src/cffi.rs). Includes provider factory, experiment runner, structured envelope scoring, markdown report generation, and justfile recipes. Fixes silent error swallowing from saige SDK ErrorDelta by extracting stream errors in all runner paths. Fixes Ollama provider ignoring --host when OLLAMA_API_KEY is set.
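
The Ollama host fix can be pictured as a precedence rule: an explicit `--host` value wins, and the API key only selects the hosted endpoint when no host was given. The sketch below assumes that rule; `resolveOllamaHost` and the two constants are hypothetical names for this example, not the PR's actual code.

```go
package main

import "fmt"

const (
	// localOllamaHost is Ollama's default local address.
	localOllamaHost = "http://localhost:11434"
	// cloudOllamaHost is the hosted endpoint the PR says requests were
	// redirected to (exact URL form is an assumption).
	cloudOllamaHost = "https://ollama.com"
)

// resolveOllamaHost picks the base URL for the Ollama provider.
// hostFlag is the value of --host; apiKey is the value of OLLAMA_API_KEY.
func resolveOllamaHost(hostFlag, apiKey string) string {
	if hostFlag != "" {
		// An explicit host always takes precedence over the API key.
		return hostFlag
	}
	if apiKey != "" {
		// No host given but a key is present: use the hosted endpoint.
		return cloudOllamaHost
	}
	return localOllamaHost
}

func main() {
	fmt.Println(resolveOllamaHost("http://gpu-box:11434", "secret"))
	fmt.Println(resolveOllamaHost("", "secret"))
	fmt.Println(resolveOllamaHost("", ""))
}
```

Keeping the decision in one pure function makes the precedence easy to unit test without setting environment variables.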

urmzd merged commit a47fc53 into main Apr 14, 2026
1 check passed

urmzd deleted the feat/go-eval-cli branch April 14, 2026 23:54

urmzd merged 1 commit into main from feat/go-eval-cli
