
[plan] Add unit tests for benchmark statistics and threshold logic #1761

@github-actions

Description


Objective

Add unit tests for scripts/ci/benchmark-performance.ts to verify the statistics calculation, threshold checking, regression detection logic, and JSON report structure — without requiring Docker or awf to be installed.

Context

The benchmark script has non-trivial logic that should be testable in isolation:

  • stats(values) — computes mean, median, p95, p99
  • Threshold comparisons (r.p95 > threshold.critical)
  • BenchmarkReport JSON structure and field types

Currently there are zero tests for the benchmark script. Adding tests ensures the statistical calculations are correct and prevents regressions in the benchmark tooling itself.
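To make the expected behavior concrete, here is an illustrative sketch of a stats() helper that is consistent with the test expectations in this plan (e.g. p95 of twenty values 0..190 being 190). Treat it as an assumption about the shape of the function, not the actual implementation in benchmark-performance.ts, which may use a different percentile interpolation strategy:

```typescript
// Sketch only: the real stats() lives in scripts/ci/benchmark-performance.ts.
interface Stats {
  mean: number;
  median: number;
  p95: number;
  p99: number;
}

function stats(values: number[]): Stats {
  const sorted = [...values].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  // Index-based percentile over the sorted sample (no interpolation):
  // p maps to index ceil((p/100) * (n - 1)), so p95 of 20 evenly spaced
  // values selects the last element.
  const pct = (p: number): number =>
    sorted[Math.ceil((p / 100) * (sorted.length - 1))];
  return { mean, median: pct(50), p95: pct(95), p99: pct(99) };
}
```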

Approach

Extract the pure logic functions from benchmark-performance.ts into a separate module so they can be imported without side effects:

  1. Refactor benchmark-performance.ts: Extract stats(), threshold comparison logic, and report building into a new file scripts/ci/benchmark-utils.ts that has no Docker/exec dependencies.
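The extracted module might look roughly like this. The interface fields mirror the test fixtures later in this plan, but the threshold table (and the 5000 ms critical value for container_startup_warm) is a placeholder assumption; the real values must be carried over from benchmark-performance.ts:

```typescript
// Hypothetical shape of scripts/ci/benchmark-utils.ts; names and threshold
// values are assumptions to be replaced with what the script actually defines.
export interface BenchmarkResult {
  metric: string;
  unit: string;
  values: number[];
  mean: number;
  median: number;
  p95: number;
  p99: number;
}

// Placeholder threshold table (ms); copy the real values from the script.
const THRESHOLDS: Record<string, { critical: number }> = {
  container_startup_warm: { critical: 5000 },
};

export function checkThresholds(results: BenchmarkResult[]): string[] {
  const regressions: string[] = [];
  for (const r of results) {
    const threshold = THRESHOLDS[r.metric];
    // A metric regresses when its p95 exceeds the critical threshold.
    if (threshold && r.p95 > threshold.critical) {
      regressions.push(
        `${r.metric}: p95 ${r.p95}${r.unit} exceeds critical threshold ` +
          `${threshold.critical}${r.unit}`
      );
    }
  }
  return regressions;
}
```

Keeping the module free of Docker/exec imports is what lets Jest load it directly.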

  2. Create scripts/ci/benchmark-utils.test.ts (or place it in src/ so Jest picks it up):

import { stats, checkThresholds } from './benchmark-utils';

describe('stats()', () => {
  it('computes mean correctly', () => {
    expect(stats([10, 20, 30]).mean).toBe(20);
  });
  it('computes median correctly for odd count', () => {
    expect(stats([1, 3, 5]).median).toBe(3);
  });
  it('computes p95 for small arrays', () => {
    const values = Array.from({ length: 20 }, (_, i) => i * 10);
    expect(stats(values).p95).toBe(190);
  });
  it('handles single-element arrays', () => {
    const result = stats([42]);
    expect(result.mean).toBe(42);
    expect(result.p99).toBe(42);
  });
});

describe('checkThresholds()', () => {
  it('detects critical threshold breach', () => {
    const regressions = checkThresholds([
      { metric: 'container_startup_warm', unit: 'ms', values: [], mean: 0, median: 0, p95: 9000, p99: 9000 },
    ]);
    expect(regressions).toHaveLength(1);
    expect(regressions[0]).toContain('container_startup_warm');
  });
  it('returns no regressions when within threshold', () => {
    const regressions = checkThresholds([
      { metric: 'container_startup_warm', unit: 'ms', values: [], mean: 0, median: 0, p95: 3000, p99: 3000 },
    ]);
    expect(regressions).toHaveLength(0);
  });
});
  3. Update jest.config.js or tsconfig.json if needed to include scripts/ in the test scan path.
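If Jest's current roots only cover src/, the change in step 3 could look like the fragment below. This is a sketch to merge into the project's existing config; the ts-jest preset is an assumption about how .ts tests are already transformed:

```javascript
// jest.config.js (sketch): add scripts/ so Jest discovers
// scripts/ci/benchmark-utils.test.ts alongside the existing src/ tests.
module.exports = {
  preset: 'ts-jest', // assumption: ts-jest already handles .ts test files
  roots: ['<rootDir>/src', '<rootDir>/scripts'],
};
```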

Files to Create/Modify

  • Create: scripts/ci/benchmark-utils.ts — extracted pure logic (stats, threshold checking, report building)
  • Create: scripts/ci/benchmark-utils.test.ts (or src/benchmark-utils.test.ts) — unit tests
  • Modify: scripts/ci/benchmark-performance.ts — import from benchmark-utils.ts instead of defining inline
  • Possibly modify: jest.config.js — include scripts/ directory in test scan

Acceptance Criteria

  • npm test runs the new unit tests without requiring Docker
  • stats() function is tested for mean, median, p95, p99 with at least 5 test cases
  • Threshold breach detection is tested for both breach and no-breach cases
  • The benchmark script still runs correctly end-to-end (existing functionality not broken)

Related to [Long-term] Add performance benchmarking suite #240

Generated by Plan Command for issue #240
