Development

Yiqun (Ethan) Zhang edited this page Feb 24, 2026 · 5 revisions

This page covers how to build, test, and contribute to PBench.

Prerequisites

  • Go 1.25+
  • Python 3 (required by some benchmark test fixtures that use shell script hooks)

Building

go build ./...

Running Tests

# Run all tests
go test ./... -race -count=1 -timeout 120s

# Run a single test
go test ./... -run TestFunctionName -race -count=1

# Run tests for a specific package
go test ./stage/... -count=1

# Run tests with coverage
go test ./... -coverprofile=coverage.out -covermode=atomic
go tool cover -func=coverage.out | grep total
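The threshold check that CI applies to this total can be sketched in plain Go. This is an illustrative stand-in, not PBench's actual CI step, and the 75% minimum below is a made-up value:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// checkCoverage parses the "total:" line emitted by
// `go tool cover -func=coverage.out` and compares it to a minimum percentage.
func checkCoverage(funcOutput string, minPct float64) (float64, bool) {
	for _, line := range strings.Split(funcOutput, "\n") {
		if !strings.HasPrefix(line, "total:") {
			continue
		}
		// e.g. ["total:", "(statements)", "82.3%"]
		fields := strings.Fields(line)
		pct, err := strconv.ParseFloat(strings.TrimSuffix(fields[len(fields)-1], "%"), 64)
		if err != nil {
			return 0, false
		}
		return pct, pct >= minPct
	}
	return 0, false
}

func main() {
	sample := "pbench/stage/stage.go:42:\tRun\t90.0%\ntotal:\t(statements)\t82.3%"
	pct, ok := checkCoverage(sample, 75.0)
	fmt.Printf("coverage %.1f%%, passes: %v\n", pct, ok)
}
```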

Linting

go vet ./...
gofmt -w .
staticcheck ./...

Continuous Integration

PBench uses GitHub Actions for CI. The workflow is defined in .github/workflows/ci.yml and runs on every push and pull request to main. It includes the following jobs:

  • Test — Runs go test ./... -v -race -count=1 with Python 3 available for shell script hooks.
  • Coverage — Runs tests with -coverprofile, checks the result against a minimum coverage threshold, and uploads the coverage report as an artifact.
  • Lint — Runs go vet, a gofmt formatting check, and staticcheck.
  • Vulnerability Check — Runs govulncheck ./... to scan dependencies for known vulnerabilities.
  • Build — Runs go build ./... and verifies that go.mod is tidy.

Writing Tests

Mock Presto Server

PBench uses prestotest.MockPrestoServer from presto-go for integration testing. This allows tests to run against a realistic Presto-like HTTP server without requiring an actual Presto cluster.

Basic usage

import (
    presto "github.com/ethanyzhang/presto-go"
    "github.com/ethanyzhang/presto-go/prestotest"
)

func TestMyFeature(t *testing.T) {
    mock := prestotest.NewMockPrestoServer()
    defer mock.Close()

    // Register a query with expected SQL and mock response
    mock.AddQuery(&prestotest.MockQueryTemplate{
        SQL:     "SELECT id, name FROM users",
        Columns: []presto.Column{
            {Name: "id", Type: "bigint"},
            {Name: "name", Type: "varchar"},
        },
        Data: [][]any{{1, "alice"}, {2, "bob"}},
    })

    // Create a client pointing at the mock server
    client, _ := presto.NewClient(mock.URL())

    // Use the client in your test...
}

Multi-batch results

Use DataBatches to split results across multiple HTTP responses, simulating how Presto returns large result sets:

mock.AddQuery(&prestotest.MockQueryTemplate{
    SQL:         "SELECT id FROM large_table",
    Columns:     []presto.Column{{Name: "id", Type: "bigint"}},
    Data:        [][]any{{1}, {2}, {3}, {4}, {5}, {6}},
    DataBatches: 3, // Split 6 rows into 3 batches of 2
})
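The splitting arithmetic the comment describes can be sketched as a small stdlib function. This is an illustration of the semantics, not the mock server's actual code; it assumes any remainder rows go to the leading batches:

```go
package main

import "fmt"

// splitBatches divides rows into n batches of near-equal size,
// giving each leading batch one extra row when the count is uneven.
func splitBatches(rows [][]any, n int) [][][]any {
	batches := make([][][]any, 0, n)
	size, rem := len(rows)/n, len(rows)%n
	for i, start := 0, 0; i < n; i++ {
		end := start + size
		if i < rem {
			end++ // distribute the remainder across the first batches
		}
		batches = append(batches, rows[start:end])
		start = end
	}
	return batches
}

func main() {
	rows := [][]any{{1}, {2}, {3}, {4}, {5}, {6}}
	for i, b := range splitBatches(rows, 3) {
		fmt.Printf("batch %d: %d rows\n", i, len(b))
	}
}
```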

Simulating errors

Use Error and QueueBatches to simulate query failures that appear during result fetching (matching real Presto behavior where errors often appear after the initial queued state):

mock.AddQuery(&prestotest.MockQueryTemplate{
    SQL:          "SELECT * FROM nonexistent",
    QueueBatches: 2,
    Error: &presto.QueryError{
        ErrorName: "TABLE_NOT_FOUND",
        Message:   "Table does not exist",
        ErrorCode: 1,
        ErrorType: "USER_ERROR",
    },
})

Latency simulation

Use SetDefaultLatency to add artificial latency, useful for testing cancellation and timeout behavior:

mock.SetDefaultLatency(100 * time.Millisecond)

Testing prestoapi.QueryAndUnmarshal

For code that uses prestoapi.QueryAndUnmarshal, the mock server provides end-to-end testing from query submission through result unmarshalling:

type UserRow struct {
    ID   int    `presto:"id"`
    Name string `presto:"name"`
}

var results []UserRow
err := prestoapi.QueryAndUnmarshal(ctx, &client.Session, "SELECT id, name FROM users", &results)
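Tag-driven unmarshalling of this sort is typically built on reflection. The following is a simplified stdlib sketch of the idea, not presto-go's actual implementation:

```go
package main

import (
	"fmt"
	"reflect"
)

// unmarshalRows maps each row (a slice of values ordered to match cols)
// onto a struct slice by matching column names against `presto:"..."` tags.
func unmarshalRows(cols []string, rows [][]any, dest any) {
	slice := reflect.ValueOf(dest).Elem() // *[]T -> settable []T
	elemType := slice.Type().Elem()
	for _, row := range rows {
		item := reflect.New(elemType).Elem()
		for i, col := range cols {
			for f := 0; f < elemType.NumField(); f++ {
				if elemType.Field(f).Tag.Get("presto") == col {
					item.Field(f).Set(reflect.ValueOf(row[i]).Convert(elemType.Field(f).Type))
				}
			}
		}
		slice.Set(reflect.Append(slice, item))
	}
}

type UserRow struct {
	ID   int    `presto:"id"`
	Name string `presto:"name"`
}

func main() {
	var users []UserRow
	unmarshalRows([]string{"id", "name"}, [][]any{{1, "alice"}, {2, "bob"}}, &users)
	fmt.Println(users)
}
```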

Testing Stage Execution

The stage package has helper functions for testing full stage graph execution. See stage/stage_test.go for examples of:

  • Parsing and executing multi-stage benchmark graphs
  • Random execution mode
  • Cold/warm runs
  • Expected row count validation
  • Save output verification
  • Context cancellation and abort-on-error behavior

The pattern is:

mock := setupMockServer()
defer mock.Close()

stage, _, _ := ParseStageGraphFromFile("path/to/stage.json")
stage.InitStates()
stage.States.NewClient = func() *presto.Client {
    client, _ := presto.NewClient(mock.URL())
    return client
}
stage.Run(context.Background())

Releasing

Via GitHub Actions (recommended)

Go to Actions > Release > Run workflow on the GitHub repository page. Fill in the version number and optional release notes, then click Run workflow. The workflow will cross-compile, package, and create the release automatically.

Via command line

Requires the GitHub CLI (gh) to be installed and authenticated.

make release VERSION=1.2

What gets released

Both methods produce the same output:

  1. Cross-compile for all platforms (darwin/linux, amd64/arm64)
  2. Package each binary with benchmarks and cluster templates into a tarball
  3. Create a GitHub release tagged v1.2 and upload the four tarballs

The release will contain:

  • pbench_darwin_amd64.tar.gz
  • pbench_darwin_arm64.tar.gz
  • pbench_linux_amd64.tar.gz
  • pbench_linux_arm64.tar.gz
  • Source code archives (added automatically by GitHub)

Project Structure

pbench/
  main.go                  # Entry point
  cmd/                     # Cobra command definitions
    cmd.go                 # Root command
    run.go, save.go, ...   # Subcommand wiring (flags, args)
    run/                   # pbench run implementation
    save/                  # pbench save implementation
    loadjson/              # pbench loadjson implementation
    forward/               # pbench forward implementation
    cmp/                   # pbench cmp implementation
    replay/                # pbench replay implementation
    round/                 # pbench round implementation
    genconfig/             # pbench genconfig implementation
    genddl/                # pbench genddl implementation
    queryplan/             # pbench queryplan implementation
  stage/                   # Stage graph parsing, execution, run recorders
  prestoapi/               # Query unmarshalling, SQL splitter, plan node parsing
  utils/                   # Shared utilities (ORM, Row, PrestoFlags, etc.)
  log/                     # Logging wrapper (zerolog)
  clusters/                # Cluster configuration types
  benchmarks/              # Benchmark definition files (TPC-DS, TPC-H, test fixtures)
  .github/workflows/       # CI pipeline

Build Tags

  • influx — Enables InfluxDB run recorder (stage/influx_run_recorder.go). Without this tag, a no-op stub is used.
  • experimental — Enables experimental commands (e.g., pbench round).
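The stub side of the influx tag pattern looks roughly like this; the type and method names below are illustrative, not PBench's actual API. The real recorder lives in a sibling file guarded by `//go:build influx`, so callers never need to branch on the tag:

```go
//go:build !influx

package main

import "fmt"

// InfluxRunRecorder is the no-op stand-in compiled when the influx
// build tag is absent.
type InfluxRunRecorder struct{}

// RecordRun does nothing in the stub build.
func (InfluxRunRecorder) RecordRun(runName string) {}

func main() {
	var r InfluxRunRecorder
	r.RecordRun("tpcds") // compiles and runs, but records nothing
	fmt.Println("no-op recorder")
}
```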

Command Architecture

Each subcommand lives in cmd/<name>/main.go with a corresponding cmd/<name>.go for Cobra wiring (flags, args). Package-level variables for flags and runtime state are the standard Cobra pattern — each package corresponds to exactly one command, and only one command runs per process, so this is fine.

Concurrency Pattern

Several commands (save, forward, replay, loadjson) share the same parallelism pattern:

var (
    parallelismGuard chan struct{}   // semaphore for max concurrency
    runningTasks     sync.WaitGroup // tracks in-flight goroutines
)

This could be extracted into a shared WorkerPool utility if more commands adopt the pattern.
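A self-contained sketch of the pattern in action; the task body and counts here are illustrative stand-ins for the real per-query or per-file work:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runWithLimit runs n tasks through the semaphore + WaitGroup pattern,
// keeping at most maxParallel goroutines in flight at once.
func runWithLimit(n, maxParallel int, task func(id int)) int64 {
	parallelismGuard := make(chan struct{}, maxParallel) // semaphore for max concurrency
	var runningTasks sync.WaitGroup                      // tracks in-flight goroutines
	var completed atomic.Int64

	for i := 0; i < n; i++ {
		parallelismGuard <- struct{}{} // blocks once maxParallel tasks are running
		runningTasks.Add(1)
		go func(id int) {
			defer func() {
				<-parallelismGuard // release the slot
				runningTasks.Done()
			}()
			task(id)
			completed.Add(1)
		}(i)
	}
	runningTasks.Wait()
	return completed.Load()
}

func main() {
	n := runWithLimit(20, 4, func(id int) { /* real work: run a query, load a file */ })
	fmt.Println("completed:", n)
}
```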

Notable Design Choices

  • run — Delegates orchestration to the stage/ package; the cleanest separation of concerns. prepareClient() auto-detects catalog/schema/timezone mismatches and creates a new client when needed.
  • forward — The most complex command: polls the source cluster, filters and transforms queries, and forwards them to N target clusters, with a query cache for cancellation propagation.
  • loadjson — Creates a pseudo-stage to reuse the run recorder infrastructure. Uses syncedTime to infer run start/end from individual query timestamps.
  • save — handleQueryError in table_summary.go returns (retry, fatal); callers decide whether to retry or abort. Fatal errors from metadata queries (SHOW CREATE TABLE, SHOW STATS, DESCRIBE) stop the current table; non-fatal errors in partition loops are retried.
  • genconfig — Uses a generic map[string]any for parameters instead of typed structs. Computation logic lives in a .prelude template file that uses set to mutate the map. The -p flag is repeatable for stacking parameter files. Templates reference snake_case JSON keys directly.
  • genddl — Defaults to TPC-DS but is configurable via the workload and workload_definition fields in the config file. Paths are resolved relative to the config file's directory.
  • queryplan — Parses query plan JSON from CSV files and outputs join information as JSON. Uses a string flag output for the file path and a separate outputFile variable for the *os.File.

Error Handling Philosophy

Commands are generally continue-on-error: individual item failures (queries, files, tables) are logged but don't abort the entire operation. This is intentional — for batch operations like loading 1000 JSON files or saving 200 table summaries, a single failure shouldn't stop the whole run. The only exceptions are truly unrecoverable errors (can't connect to server, can't open input file) which use log.Fatal().
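In code, continue-on-error typically reads like the following sketch; processItem and the file names are hypothetical:

```go
package main

import (
	"fmt"
	"log"
)

// processItem is a stand-in for handling one query, file, or table summary.
func processItem(name string) error {
	if name == "bad.json" {
		return fmt.Errorf("parse error in %s", name)
	}
	return nil
}

// processAll applies continue-on-error: individual failures are logged
// and counted, but never abort the batch.
func processAll(items []string) (ok, failed int) {
	for _, item := range items {
		if err := processItem(item); err != nil {
			log.Printf("skipping %s: %v", item, err)
			failed++
			continue
		}
		ok++
	}
	return ok, failed
}

func main() {
	ok, failed := processAll([]string{"a.json", "bad.json", "c.json"})
	fmt.Printf("processed %d, skipped %d\n", ok, failed)
}
```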
