holo-q/vtr

vtr

Multi-dimensional AI-assisted testing framework with tracing, signal rewards, and continuous feedback for agent guidance.

Traditional testing is binary: pass or fail. This works for humans who understand nuance, but AI agents need richer signals. VTR (Verifier-Trace-Reward) treats tests as reward functions that produce continuous metrics, structured execution traces, and contextual prompts to guide agents toward correct behavior.

Why VTR

  • Continuous signals: 0.0..1.0 distance from target, not just pass/fail. Enables gradient-based improvement.
  • Multi-dimensional rewards — score coverage, latency, memory simultaneously without hiding trade-offs.
  • Structured tracing — semantic categories (state / decision / effect / boundary / checkpoint) with depth-rendered call trees.
  • Async-safe depth — depth tracked via span ancestry, not thread-local. Correct across tokio task migration.
  • Ring-buffer capture — full-verbosity tracing with zero disk I/O until failure dumps the buffer.
  • Correlation IDs — follow a logical operation across async / process boundaries.
  • "Never done" recurrence — even when all tests pass, VTR suggests coverage gaps, mutation tests, and adversarial inputs.
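To make the "async-safe depth" point concrete, here is a minimal sketch of deriving depth from span ancestry instead of a thread-local counter. `SpanId`, `SpanTree`, `enter`, and `depth` are hypothetical names chosen for illustration; they are not VTR's documented API, and the sketch assumes parent links never form a cycle.

```rust
use std::collections::HashMap;

/// Hypothetical span identifier; not part of VTR's documented API.
type SpanId = u64;

/// Minimal span registry: each span records only its parent.
#[derive(Default)]
struct SpanTree {
    parents: HashMap<SpanId, Option<SpanId>>,
}

impl SpanTree {
    /// Register a span under an optional parent.
    fn enter(&mut self, id: SpanId, parent: Option<SpanId>) {
        self.parents.insert(id, parent);
    }

    /// Depth is the number of ancestors, found by walking parent links.
    /// No thread-local counter is involved, so the result does not
    /// depend on which thread happens to be polling the task.
    fn depth(&self, id: SpanId) -> usize {
        let mut depth = 0;
        let mut current = id;
        while let Some(Some(parent)) = self.parents.get(&current) {
            depth += 1;
            current = *parent;
        }
        depth
    }
}

fn main() {
    let mut tree = SpanTree::default();
    tree.enter(1, None); // root span
    tree.enter(2, Some(1)); // child
    tree.enter(3, Some(2)); // grandchild

    // Indent each span by its ancestry depth, as a call tree would.
    for id in [1, 2, 3] {
        println!("{}span {}", "  ".repeat(tree.depth(id)), id);
    }
}
```

Because depth is a property of the span graph, a task that resumes on a different worker thread still renders at the same indentation.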

Install

[dependencies]
vtr = "0.1"

# optional features
# x11         — Xvfb/Xephyr orchestration, XTEST input, EWMH
# stroboscope — full AST-level tracing (expensive, dev only)
# full        — x11 + stroboscope

Quick start

use vtr::prelude::*;

#[vtr::test]
fn connection_works() -> Result<(), Box<dyn std::error::Error>> {
    let session = connect()?;

    // Continuous signals: how close to target?
    vtr::signal!("latency_ms", session.latency_ms(), target: 100.0, minimize);
    vtr::signal!("coverage",   coverage_pct(),       target: 0.95,  maximize);

    // Semantic traces: structured, depth-aware, color-coded
    vtr::state!("session", "Disconnected" => "Connected");
    vtr::checkpoint!("ready");

    Ok(())
}

Run with cargo test. VTR produces a rich report:

═══ VTR Report ═══
Score: 0.92 (target: 1.0)

Tests: 1 passed, 0 failed

Signals:
  latency_ms:  85   [████████░░] target=100  ↓  ✓
  coverage:    0.92 [█████████░] target=0.95 ↑  ✗

Trace:
  → connection_works
    ◆ Disconnected → Connected
    ● ready
  ← connection_works ok (85ms)

Actions:
  • Coverage at 92% — add tests for error paths in auth.rs
  • Try adversarial: empty input, boundary values
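The report does not say how a raw measurement becomes a 0.0..1.0 signal. One plausible normalization, sketched below, is a clamped ratio against the target. The `Direction` enum, the `score` function, and the formula itself are assumptions for illustration, not VTR's documented scoring rule, and the sketch does not try to reproduce the aggregate Score line.

```rust
/// Whether a signal should go down or up. Hypothetical type; only the
/// `minimize` / `maximize` keywords come from the quick start.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Direction {
    Minimize,
    Maximize,
}

/// Map a raw measurement to a score in 0.0..=1.0, where 1.0 means the
/// target is met. The ratio-based formula is an assumption.
fn score(value: f64, target: f64, direction: Direction) -> f64 {
    let ratio = match direction {
        // Guard against division by zero (or a negative denominator).
        Direction::Minimize if value <= 0.0 => 1.0,
        Direction::Minimize => target / value,
        Direction::Maximize if target <= 0.0 => 1.0,
        Direction::Maximize => value / target,
    };
    ratio.clamp(0.0, 1.0)
}

fn main() {
    // Values taken from the sample report.
    let latency = score(85.0, 100.0, Direction::Minimize);
    let coverage = score(0.92, 0.95, Direction::Maximize);
    println!("latency_ms score: {latency:.2}"); // prints 1.00
    println!("coverage score:   {coverage:.2}"); // prints 0.97
}
```

Under this assumed formula, a latency of 200 ms against a target of 100 would score 0.5, which tells the agent both that the target was missed and by how much.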

The emergent loop

VTR runs one-shot. The agent reacts:

agent runs cargo test → VTR emits scored trace → agent reads, fixes → loop

VTR's job is not to be the loop. It is to telegraph clearly so the agent knows what's wrong, how wrong, in which direction to improve, and what concrete next action to try. The loop is emergent.
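On the agent side of this loop, the first step is reading the report. The sketch below parses lines shaped like the "Signals:" section of the sample report and surfaces the ones that missed their target. `SignalLine` and `parse_signal_line` are hypothetical helpers, and the line layout is assumed to match the sample output exactly; VTR may offer a structured output format that would make this unnecessary.

```rust
/// One parsed line from the "Signals:" section of the report.
/// Hypothetical type; only the line layout comes from the sample report.
#[derive(Debug, PartialEq)]
struct SignalLine {
    name: String,
    value: f64,
    target: f64,
    minimize: bool,
    met: bool,
}

/// Parse a line such as `latency_ms:  85  [████████░░] target=100  ↓  ✓`.
/// Returns `None` for anything that does not match that layout.
fn parse_signal_line(line: &str) -> Option<SignalLine> {
    let mut tokens = line.split_whitespace();
    let name = tokens.next()?.strip_suffix(':')?.to_string();
    let value = tokens.next()?.parse::<f64>().ok()?;
    let _bar = tokens.next()?; // progress bar, ignored
    let target = tokens.next()?.strip_prefix("target=")?.parse::<f64>().ok()?;
    let minimize = match tokens.next()? {
        "↓" => true,
        "↑" => false,
        _ => return None,
    };
    let met = match tokens.next()? {
        "✓" => true,
        "✗" => false,
        _ => return None,
    };
    Some(SignalLine { name, value, target, minimize, met })
}

fn main() {
    let report = [
        "  latency_ms:  85   [████████░░] target=100  ↓  ✓",
        "  coverage:    0.92 [█████████░] target=0.95 ↑  ✗",
    ];
    // Surface only the signals that missed their target.
    for signal in report.iter().copied().filter_map(parse_signal_line) {
        if !signal.met {
            println!(
                "needs work: {} = {} (target {})",
                signal.name, signal.value, signal.target
            );
        }
    }
}
```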

Modules

| Module | Purpose |
| --- | --- |
| signals | Binary / Scalar / Vector / Gradient reward types |
| trace | Semantic macros, depth-rendered formatter, ring buffer, journald sinks |
| harness | #[vtr::test] runtime, signal! macro, thread-local context |
| feedback | Meta-review triggers, prompt templates, contextual guidance |
| recurrence | NeverDone, coverage gaps, mutation hints, adversarial inputs |
| logging | Centralized init, SIGHUP hot-reload, TUI buffering |
| x11 (feature-gated) | Xvfb/Xephyr, XTEST, EWMH for GUI integration |
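The ring buffer listed under trace can be pictured as a bounded in-memory queue that is only written out when a test fails. The following is a minimal sketch of that idea, assuming events are plain strings; `RingBuffer`, `push`, and `dump` are hypothetical names and do not describe the actual trace module.

```rust
use std::collections::VecDeque;

/// Fixed-capacity in-memory event buffer. A hypothetical sketch, not
/// VTR's actual `trace` implementation.
struct RingBuffer {
    capacity: usize,
    events: VecDeque<String>,
}

impl RingBuffer {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            events: VecDeque::with_capacity(capacity),
        }
    }

    /// Record an event, evicting the oldest one when the buffer is full.
    fn push(&mut self, event: impl Into<String>) {
        if self.capacity == 0 {
            return;
        }
        if self.events.len() == self.capacity {
            self.events.pop_front();
        }
        self.events.push_back(event.into());
    }

    /// Drain the buffered events, oldest first. Intended to be called
    /// only when a test fails, so the passing path does no I/O.
    fn dump(&mut self) -> Vec<String> {
        self.events.drain(..).collect()
    }
}

fn main() {
    let mut buffer = RingBuffer::new(3);
    for step in 1..=5 {
        buffer.push(format!("step {step}"));
    }

    let test_failed = true; // stand-in for a real test outcome
    if test_failed {
        // Only the three most recent events survive: steps 3, 4, 5.
        for line in buffer.dump() {
            eprintln!("{line}");
        }
    }
}
```

The trade-off in this design is that memory stays bounded and passing tests cost nothing on disk, at the price of losing events older than the buffer's capacity.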

License

MIT
