Development

Yiqun (Ethan) Zhang edited this page Feb 24, 2026 · 5 revisions

This page covers how to build, test, and contribute to PBench.

Prerequisites

  • Go 1.25+
  • Python 3 (required by some benchmark test fixtures that use shell script hooks)

Building

go build ./...

Running Tests

# Run all tests
go test ./... -race -count=1 -timeout 120s

# Run a single test
go test ./... -run TestFunctionName -race -count=1

# Run tests for a specific package
go test ./stage/... -count=1

# Run tests with coverage
go test ./... -coverprofile=coverage.out -covermode=atomic
go tool cover -func=coverage.out | grep total
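The threshold check that CI applies to this total can be sketched in plain Go. This is an illustrative stand-in, not PBench's actual CI step, and the 75% minimum below is a made-up value:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// checkCoverage parses the "total:" line emitted by
// `go tool cover -func=coverage.out` and compares it to a minimum percentage.
func checkCoverage(funcOutput string, minPct float64) (float64, bool) {
	for _, line := range strings.Split(funcOutput, "\n") {
		if !strings.HasPrefix(line, "total:") {
			continue
		}
		// e.g. ["total:", "(statements)", "82.3%"]
		fields := strings.Fields(line)
		pct, err := strconv.ParseFloat(strings.TrimSuffix(fields[len(fields)-1], "%"), 64)
		if err != nil {
			return 0, false
		}
		return pct, pct >= minPct
	}
	return 0, false
}

func main() {
	sample := "pbench/stage/stage.go:42:\tRun\t90.0%\ntotal:\t(statements)\t82.3%"
	pct, ok := checkCoverage(sample, 75.0)
	fmt.Printf("coverage %.1f%%, passes: %v\n", pct, ok)
}
```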

Linting

go vet ./...
gofmt -w .
staticcheck ./...

Continuous Integration

PBench uses GitHub Actions for CI. The workflow is defined in .github/workflows/ci.yml and runs on every push and pull request to main. It includes the following jobs:

  • Test — Runs go test ./... -v -race -count=1 with Python 3 available for shell script hooks.
  • Coverage — Runs tests with -coverprofile, checks the result against a minimum coverage threshold, and uploads the coverage report as an artifact.
  • Lint — Runs go vet, a gofmt formatting check, and staticcheck.
  • Vulnerability Check — Runs govulncheck ./... to scan dependencies for known vulnerabilities.
  • Build — Runs go build ./... and verifies that go.mod is tidy.

Writing Tests

Mock Presto Server

PBench uses prestotest.MockPrestoServer from presto-go for integration testing. This allows tests to run against a realistic Presto-like HTTP server without requiring an actual Presto cluster.

Basic usage

import (
    presto "github.com/ethanyzhang/presto-go"
    "github.com/ethanyzhang/presto-go/prestotest"
)

func TestMyFeature(t *testing.T) {
    mock := prestotest.NewMockPrestoServer()
    defer mock.Close()

    // Register a query with expected SQL and mock response
    mock.AddQuery(&prestotest.MockQueryTemplate{
        SQL:     "SELECT id, name FROM users",
        Columns: []presto.Column{
            {Name: "id", Type: "bigint"},
            {Name: "name", Type: "varchar"},
        },
        Data: [][]any{{1, "alice"}, {2, "bob"}},
    })

    // Create a client pointing at the mock server
    client, _ := presto.NewClient(mock.URL())

    // Use the client in your test...
}

Multi-batch results

Use DataBatches to split results across multiple HTTP responses, simulating how Presto returns large result sets:

mock.AddQuery(&prestotest.MockQueryTemplate{
    SQL:         "SELECT id FROM large_table",
    Columns:     []presto.Column{{Name: "id", Type: "bigint"}},
    Data:        [][]any{{1}, {2}, {3}, {4}, {5}, {6}},
    DataBatches: 3, // Split 6 rows into 3 batches of 2
})
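The splitting arithmetic the comment describes can be sketched as a small stdlib function. This is an illustration of the semantics, not the mock server's actual code; it assumes any remainder rows go to the leading batches:

```go
package main

import "fmt"

// splitBatches divides rows into n batches of near-equal size,
// giving each leading batch one extra row when the count is uneven.
func splitBatches(rows [][]any, n int) [][][]any {
	batches := make([][][]any, 0, n)
	size, rem := len(rows)/n, len(rows)%n
	for i, start := 0, 0; i < n; i++ {
		end := start + size
		if i < rem {
			end++ // distribute the remainder across the first batches
		}
		batches = append(batches, rows[start:end])
		start = end
	}
	return batches
}

func main() {
	rows := [][]any{{1}, {2}, {3}, {4}, {5}, {6}}
	for i, b := range splitBatches(rows, 3) {
		fmt.Printf("batch %d: %d rows\n", i, len(b))
	}
}
```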

Simulating errors

Use Error and QueueBatches to simulate query failures that appear during result fetching (matching real Presto behavior where errors often appear after the initial queued state):

mock.AddQuery(&prestotest.MockQueryTemplate{
    SQL:          "SELECT * FROM nonexistent",
    QueueBatches: 2,
    Error: &presto.QueryError{
        ErrorName: "TABLE_NOT_FOUND",
        Message:   "Table does not exist",
        ErrorCode: 1,
        ErrorType: "USER_ERROR",
    },
})

Latency simulation

Use SetDefaultLatency to add artificial latency, useful for testing cancellation and timeout behavior:

mock.SetDefaultLatency(100 * time.Millisecond)

Testing prestoapi.QueryAndUnmarshal

For code that uses prestoapi.QueryAndUnmarshal, the mock server provides end-to-end testing from query submission through result unmarshalling:

type UserRow struct {
    ID   int    `presto:"id"`
    Name string `presto:"name"`
}

var results []UserRow
err := prestoapi.QueryAndUnmarshal(ctx, &client.Session, "SELECT id, name FROM users", &results)
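Tag-driven unmarshalling of this sort is typically built on reflection. The following is a simplified stdlib sketch of the idea, not presto-go's actual implementation:

```go
package main

import (
	"fmt"
	"reflect"
)

// unmarshalRows maps each row (a slice of values ordered to match cols)
// onto a struct slice by matching column names against `presto:"..."` tags.
func unmarshalRows(cols []string, rows [][]any, dest any) {
	slice := reflect.ValueOf(dest).Elem() // *[]T -> settable []T
	elemType := slice.Type().Elem()
	for _, row := range rows {
		item := reflect.New(elemType).Elem()
		for i, col := range cols {
			for f := 0; f < elemType.NumField(); f++ {
				if elemType.Field(f).Tag.Get("presto") == col {
					item.Field(f).Set(reflect.ValueOf(row[i]).Convert(elemType.Field(f).Type))
				}
			}
		}
		slice.Set(reflect.Append(slice, item))
	}
}

type UserRow struct {
	ID   int    `presto:"id"`
	Name string `presto:"name"`
}

func main() {
	var users []UserRow
	unmarshalRows([]string{"id", "name"}, [][]any{{1, "alice"}, {2, "bob"}}, &users)
	fmt.Println(users)
}
```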

Testing Stage Execution

The stage package has helper functions for testing full stage graph execution. See stage/stage_test.go for examples of:

  • Parsing and executing multi-stage benchmark graphs
  • Random execution mode
  • Cold/warm runs
  • Expected row count validation
  • Save output verification
  • Context cancellation and abort-on-error behavior

The pattern is:

mock := setupMockServer()
defer mock.Close()

stage, _, _ := ParseStageGraphFromFile("path/to/stage.json")
stage.InitStates()
stage.States.NewClient = func() *presto.Client {
    client, _ := presto.NewClient(mock.URL())
    return client
}
stage.Run(context.Background())

Releasing

Via GitHub Actions (recommended)

Go to Actions > Release > Run workflow on the GitHub repository page. Fill in the version number and optional release notes, then click Run workflow. The workflow will cross-compile, package, and create the release automatically.

Via command line

Requires the GitHub CLI (gh) to be installed and authenticated.

make release VERSION=1.2

What gets released

Both methods produce the same output:

  1. Cross-compile for all platforms (darwin/linux, amd64/arm64)
  2. Package each binary with benchmarks and cluster templates into a tarball
  3. Create a GitHub release tagged v1.2 and upload the four tarballs

The release will contain:

  • pbench_darwin_amd64.tar.gz
  • pbench_darwin_arm64.tar.gz
  • pbench_linux_amd64.tar.gz
  • pbench_linux_arm64.tar.gz
  • Source code archives (added automatically by GitHub)

Project Structure

pbench/
  main.go                  # Entry point
  cmd/                     # Cobra command definitions
    cmd.go                 # Root command
    run.go, save.go, ...   # Subcommand wiring (flags, args)
    run/                   # pbench run implementation
    save/                  # pbench save implementation
    loadjson/              # pbench loadjson implementation
    forward/               # pbench forward implementation
    cmp/                   # pbench cmp implementation
    replay/                # pbench replay implementation
    round/                 # pbench round implementation
    genconfig/             # pbench genconfig implementation
    genddl/                # pbench genddl implementation
    queryplan/             # pbench queryplan implementation
  stage/                   # Stage graph parsing, execution, run recorders
  prestoapi/               # Query unmarshalling, SQL splitter, plan node parsing
  utils/                   # Shared utilities (ORM, Row, PrestoFlags, etc.)
  log/                     # Logging wrapper (zerolog)
  clusters/                # Cluster configuration types
  benchmarks/              # Benchmark definition files (TPC-DS, TPC-H, test fixtures)
  .github/workflows/       # CI pipeline

Build Tags

  • influx — Enables InfluxDB run recorder (stage/influx_run_recorder.go). Without this tag, a no-op stub is used.
  • experimental — Enables experimental commands (e.g., pbench round).
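The stub side of the influx tag pattern looks roughly like this; the type and method names below are illustrative, not PBench's actual API. The real recorder lives in a sibling file guarded by `//go:build influx`, so callers never need to branch on the tag:

```go
//go:build !influx

package main

import "fmt"

// InfluxRunRecorder is the no-op stand-in compiled when the influx
// build tag is absent.
type InfluxRunRecorder struct{}

// RecordRun does nothing in the stub build.
func (InfluxRunRecorder) RecordRun(runName string) {}

func main() {
	var r InfluxRunRecorder
	r.RecordRun("tpcds") // compiles and runs, but records nothing
	fmt.Println("no-op recorder")
}
```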

Command Architecture

Each subcommand lives in cmd/<name>/main.go with a corresponding cmd/<name>.go for Cobra wiring (flags, args). Package-level variables for flags and runtime state are the standard Cobra pattern — each package corresponds to exactly one command, and only one command runs per process, so this is fine.

Concurrency Pattern

Several commands (save, forward, replay, loadjson) share the same parallelism pattern:

var (
    parallelismGuard chan struct{}   // semaphore for max concurrency
    runningTasks     sync.WaitGroup // tracks in-flight goroutines
)

This could be extracted into a shared WorkerPool utility if more commands adopt the pattern.
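A self-contained sketch of the pattern in action; the task body and counts here are illustrative stand-ins for the real per-query or per-file work:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runWithLimit runs n tasks through the semaphore + WaitGroup pattern,
// keeping at most maxParallel goroutines in flight at once.
func runWithLimit(n, maxParallel int, task func(id int)) int64 {
	parallelismGuard := make(chan struct{}, maxParallel) // semaphore for max concurrency
	var runningTasks sync.WaitGroup                      // tracks in-flight goroutines
	var completed atomic.Int64

	for i := 0; i < n; i++ {
		parallelismGuard <- struct{}{} // blocks once maxParallel tasks are running
		runningTasks.Add(1)
		go func(id int) {
			defer func() {
				<-parallelismGuard // release the slot
				runningTasks.Done()
			}()
			task(id)
			completed.Add(1)
		}(i)
	}
	runningTasks.Wait()
	return completed.Load()
}

func main() {
	n := runWithLimit(20, 4, func(id int) { /* real work: run a query, load a file */ })
	fmt.Println("completed:", n)
}
```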

Notable Design Choices

  • run — Delegates orchestration to the stage/ package; the cleanest separation of concerns. prepareClient() auto-detects catalog/schema/timezone mismatches and creates a new client when needed.
  • forward — The most complex command: polls the source cluster, filters and transforms queries, and forwards them to N target clusters, with a query cache for cancellation propagation.
  • loadjson — Creates a pseudo-stage to reuse the run recorder infrastructure. Uses syncedTime to infer run start/end from individual query timestamps.
  • save — handleQueryError in table_summary.go returns (retry, fatal); callers decide whether to retry or abort. Fatal errors from metadata queries (SHOW CREATE TABLE, SHOW STATS, DESCRIBE) stop the current table; non-fatal errors in partition loops are retried.
  • genconfig — Uses a generic map[string]any for parameters instead of typed structs. Computation logic lives in a .prelude template file that uses set to mutate the map. The -p flag is repeatable for stacking parameter files. Templates reference snake_case JSON keys directly.
  • genddl — Defaults to TPC-DS but is configurable via the workload and workload_definition fields in the config file. Paths are resolved relative to the config file's directory.
  • queryplan — Parses query plan JSON from CSV files and outputs join information as JSON. Uses a string flag output for the file path and a separate outputFile variable for the *os.File.

Error Handling Philosophy

Commands are generally continue-on-error: individual item failures (queries, files, tables) are logged but don't abort the entire operation. This is intentional — for batch operations like loading 1000 JSON files or saving 200 table summaries, a single failure shouldn't stop the whole run. The only exceptions are truly unrecoverable errors (can't connect to server, can't open input file) which use log.Fatal().
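In code, continue-on-error typically reads like the following sketch; processItem and the file names are hypothetical:

```go
package main

import (
	"fmt"
	"log"
)

// processItem is a stand-in for handling one query, file, or table summary.
func processItem(name string) error {
	if name == "bad.json" {
		return fmt.Errorf("parse error in %s", name)
	}
	return nil
}

// processAll applies continue-on-error: individual failures are logged
// and counted, but never abort the batch.
func processAll(items []string) (ok, failed int) {
	for _, item := range items {
		if err := processItem(item); err != nil {
			log.Printf("skipping %s: %v", item, err)
			failed++
			continue
		}
		ok++
	}
	return ok, failed
}

func main() {
	ok, failed := processAll([]string{"a.json", "bad.json", "c.json"})
	fmt.Printf("processed %d, skipped %d\n", ok, failed)
}
```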
