
feat: add performance benchmarking suite #258

Closed
Copilot wants to merge 4 commits into main from copilot/add-performance-benchmarking-suite

Conversation

Contributor

Copilot AI commented Jan 17, 2026

Implements a performance benchmarking suite to track and prevent performance regressions over time, as outlined in the CI/CD Gap Assessment.

Benchmark Infrastructure

  • src/benchmarks/benchmark-types.ts - Types for results, stats, reports, regression detection
  • src/benchmarks/benchmark-runner.ts - Runner class with statistical analysis (min/max/mean/median/stdDev), regression detection, markdown formatting
  • Unit tests for benchmark utilities (13 tests)
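The statistical analysis listed above (min/max/mean/median/stdDev) could look roughly like the sketch below. The `BenchmarkStats` interface and `computeStats` function are hypothetical names used for illustration; the actual types in src/benchmarks/benchmark-types.ts may differ, and the choice of population standard deviation (dividing by n) is an assumption.

```typescript
// Hypothetical shape for illustration; not the PR's actual type definition.
interface BenchmarkStats {
  min: number;
  max: number;
  mean: number;
  median: number;
  stdDev: number;
}

// Summarize a list of raw samples (for example, durations in milliseconds).
function computeStats(samples: number[]): BenchmarkStats {
  if (samples.length === 0) {
    throw new Error('computeStats requires at least one sample');
  }
  // Sort a copy numerically so the caller's array is left untouched.
  const sorted = [...samples].sort((a, b) => a - b);
  const n = sorted.length;
  const mean = sorted.reduce((sum, x) => sum + x, 0) / n;
  const mid = Math.floor(n / 2);
  // Even-length inputs use the average of the two middle values.
  const median = n % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
  // Assumption: population standard deviation (divide by n, not n - 1).
  const variance = sorted.reduce((sum, x) => sum + (x - mean) ** 2, 0) / n;
  return {
    min: sorted[0],
    max: sorted[n - 1],
    mean,
    median,
    stdDev: Math.sqrt(variance),
  };
}
```

The median is generally less sensitive than the mean to a single slow run, which can matter on shared CI runners where timings are noisy.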

Metrics Tracked

  • Container startup time
  • HTTP request latency through proxy
  • File download time through proxy
  • Combined container memory usage
  • Blocked domain rejection time

CI Integration

  • .github/workflows/benchmark.yml - Runs on push/PR to main, uploads JSON report artifact, generates GitHub Actions summary
  • scripts/ci/generate-benchmark-summary.ts - Parses Jest output into markdown summary
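As a rough sketch of the summary step, the snippet below renders a list of metrics as a markdown table and appends it to the file named by the `GITHUB_STEP_SUMMARY` environment variable, which GitHub Actions provides for job summaries. The `SummaryMetric` shape and the `formatSummary` and `publishSummary` helpers are assumptions for illustration, not the contents of scripts/ci/generate-benchmark-summary.ts.

```typescript
import * as fs from 'fs';

// Hypothetical metric shape; the real script defines its own interfaces.
interface SummaryMetric {
  name: string;
  unit: string;
  mean: number;
  median: number;
  stdDev: number;
}

// Render the metrics as a markdown table with two decimal places.
function formatSummary(metrics: SummaryMetric[]): string {
  const cell = (value: number, unit: string): string => `${value.toFixed(2)} ${unit}`;
  const rows = metrics.map(
    (m) =>
      `| ${m.name} | ${cell(m.mean, m.unit)} | ${cell(m.median, m.unit)} | ${cell(m.stdDev, m.unit)} |`,
  );
  return [
    '## Benchmark Results',
    '',
    '| Metric | Mean | Median | Std Dev |',
    '| --- | --- | --- | --- |',
    ...rows,
  ].join('\n') + '\n';
}

// Append to the job summary when running in GitHub Actions, else print to stdout.
function publishSummary(markdown: string): void {
  const summaryPath = process.env.GITHUB_STEP_SUMMARY;
  if (summaryPath) {
    fs.appendFileSync(summaryPath, markdown);
  } else {
    process.stdout.write(markdown);
  }
}
```

Falling back to stdout keeps the helper usable when the benchmarks are run locally, where `GITHUB_STEP_SUMMARY` is not set.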

Usage

# Run benchmarks locally (requires sudo for iptables)
sudo -E npm run test:benchmark

# View report
cat /tmp/awf-benchmark-report.json | jq

Regression Detection

const thresholds: RegressionThreshold[] = [
  { metric: 'startup_time_ms', maxIncreasePercent: 20 },
  { metric: 'memory_mb', maxIncreasePercent: 25 },
];

const result = detectRegressions(current, baseline, thresholds);
// result.hasRegression, result.regressions, result.improvements
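For readers who want a feel for what a threshold check like this involves, here is a minimal, self-contained sketch. It is not the PR's `detectRegressions` implementation: the function name `detectRegressionsSketch`, the `MetricValues` and `MetricChange` types, and the rule that any decrease counts as an improvement are all assumptions made for illustration.

```typescript
// Matches the threshold shape shown in the usage example above.
interface RegressionThreshold {
  metric: string;
  maxIncreasePercent: number;
}

// Hypothetical types for this sketch only.
type MetricValues = Record<string, number>;

interface MetricChange {
  metric: string;
  baseline: number;
  current: number;
  changePercent: number;
}

interface RegressionResult {
  hasRegression: boolean;
  regressions: MetricChange[];
  improvements: MetricChange[];
}

function detectRegressionsSketch(
  current: MetricValues,
  baseline: MetricValues,
  thresholds: RegressionThreshold[],
): RegressionResult {
  const regressions: MetricChange[] = [];
  const improvements: MetricChange[] = [];

  for (const { metric, maxIncreasePercent } of thresholds) {
    const base = baseline[metric];
    const cur = current[metric];
    // Skip metrics missing from either run, and zero baselines (no meaningful percentage).
    if (base === undefined || cur === undefined || base === 0) continue;

    const changePercent = ((cur - base) / base) * 100;
    const change: MetricChange = { metric, baseline: base, current: cur, changePercent };

    if (changePercent > maxIncreasePercent) {
      regressions.push(change); // worse than the allowed increase
    } else if (changePercent < 0) {
      improvements.push(change); // assumption: lower is better for every metric
    }
  }

  return { hasRegression: regressions.length > 0, regressions, improvements };
}
```

Under these assumptions, a startup time that grows from 1000 ms to 1300 ms exceeds the 20% threshold and is reported as a regression, while memory dropping from 100 MB to 90 MB is reported as an improvement.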
Original prompt

This section details the original issue you should resolve

<issue_title>[Long-term] Add performance benchmarking suite</issue_title>
<issue_description>## Context
From CI/CD Pipeline Gap Assessment (Discussion #227)

Description

Create a performance benchmarking suite to track and prevent performance regressions over time.

Acceptance Criteria

  • Define key performance metrics to track (e.g., startup time, memory usage, throughput)
  • Implement benchmark tests
  • Set up automated benchmark execution in CI
  • Track benchmark results over time
  • Alert on significant performance regressions

Estimated Effort

To be determined (quarterly goal)

Related

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 2 commits January 17, 2026 05:03
Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>
Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add performance benchmarking suite to track regressions" to "feat: add performance benchmarking suite" Jan 17, 2026
Copilot AI requested a review from Mossaka January 17, 2026 05:08
@Mossaka Mossaka marked this pull request as ready for review January 17, 2026 09:10
Contributor

github-actions bot commented Jan 17, 2026

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 77.19% | 76.28% | 📉 -0.91% |
| Statements | 77.27% | 76.35% | 📉 -0.92% |
| Functions | 77.17% | 74.38% | 📉 -2.79% |
| Branches | 69.76% | 69.12% | 📉 -0.64% |

✨ New Files (1 files)
  • src/benchmarks/benchmark-runner.ts: 64.0% lines

Coverage comparison generated by scripts/ci/compare-coverage.ts

// Generate and save benchmark report
const report = await benchmarkRunner.generateReport();
const reportPath = path.join(os.tmpdir(), 'awf-benchmark-report.json');
fs.writeFileSync(reportPath, JSON.stringify(report, null, 2));

Check failure

Code scanning / CodeQL

Insecure temporary file High test

Insecure creation of file in the os temp dir.

Copilot Autofix

AI 20 days ago

In general, to fix insecure temp file creation, avoid manually constructing paths under os.tmpdir() and instead use a secure temp-file API that (a) creates the file atomically, (b) ensures it doesn’t already exist, and (c) sets safe permissions. In Node.js, a common approach is to use the tmp library’s fileSync() / file() functions to obtain a securely created temporary file, then write to that file.

For this specific code, the safest non-breaking fix is:

  • Replace the manual path.join(os.tmpdir(), 'awf-benchmark-report.json') construction with a call to tmp.fileSync, using a prefix/suffix so the filename is recognizable but still unique.
  • Use the returned .name as reportPath and continue to write to it with fs.writeFileSync exactly as before.
  • Add an import for tmp at the top of tests/benchmarks/performance.benchmark.ts.

Concrete changes in tests/benchmarks/performance.benchmark.ts:

  1. Add import * as tmp from 'tmp'; alongside the existing imports.

  2. In afterAll, change:

    const reportPath = path.join(os.tmpdir(), 'awf-benchmark-report.json');
    fs.writeFileSync(reportPath, JSON.stringify(report, null, 2));

    to:

    const tmpFile = tmp.fileSync({ prefix: 'awf-benchmark-report-', postfix: '.json' });
    const reportPath = tmpFile.name;
    fs.writeFileSync(reportPath, JSON.stringify(report, null, 2));

This preserves functionality (a JSON benchmark report saved to a temp file and its path logged) while ensuring the temp file is created securely.
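If adding the tmp dependency is undesirable, a similar effect can probably be achieved with Node's built-in `fs` module alone. The sketch below is an alternative to the autofix, not part of it: `writeReportSecurely` is a hypothetical helper name, and the `awf-benchmark-` prefix and `report.json` file name are illustrative choices.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper: writes the report into a freshly created private directory.
function writeReportSecurely(report: unknown): string {
  // mkdtempSync appends random characters to the prefix and creates the directory,
  // so the resulting path cannot be predicted or pre-created by another user.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'awf-benchmark-'));
  const reportPath = path.join(dir, 'report.json');
  // 'wx' makes the write fail if the file already exists;
  // mode 0o600 limits access to the owner on POSIX systems.
  fs.writeFileSync(reportPath, JSON.stringify(report, null, 2), {
    flag: 'wx',
    mode: 0o600,
  });
  return reportPath;
}
```

Either approach makes the report path unpredictable, so anything that reads a fixed location such as /tmp/awf-benchmark-report.json (for example a CI artifact upload step) would need to be given the returned path instead.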

Suggested changeset 2
tests/benchmarks/performance.benchmark.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/tests/benchmarks/performance.benchmark.ts b/tests/benchmarks/performance.benchmark.ts
--- a/tests/benchmarks/performance.benchmark.ts
+++ b/tests/benchmarks/performance.benchmark.ts
@@ -16,6 +16,7 @@
 import * as fs from 'fs';
 import * as path from 'path';
 import * as os from 'os';
+import * as tmp from 'tmp';
 import execa = require('execa');
 
 describe('Performance Benchmarks', () => {
@@ -38,7 +39,8 @@
   afterAll(async () => {
     // Generate and save benchmark report
     const report = await benchmarkRunner.generateReport();
-    const reportPath = path.join(os.tmpdir(), 'awf-benchmark-report.json');
+    const tmpFile = tmp.fileSync({ prefix: 'awf-benchmark-report-', postfix: '.json' });
+    const reportPath = tmpFile.name;
     fs.writeFileSync(reportPath, JSON.stringify(report, null, 2));
     console.log(`\nBenchmark report saved to: ${reportPath}`);
 
EOF

package.json
Outside changed files

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/package.json b/package.json
--- a/package.json
+++ b/package.json
@@ -43,7 +43,8 @@
     "chalk": "^4.1.2",
     "commander": "^12.0.0",
     "execa": "^5.1.1",
-    "js-yaml": "^4.1.1"
+    "js-yaml": "^4.1.1",
+    "tmp": "^0.2.5"
   },
   "devDependencies": {
     "@commitlint/cli": "^20.1.0",
EOF
This fix introduces these dependencies
| Package | Version | Security advisories |
| --- | --- | --- |
| tmp (npm) | 0.2.5 | None |
*/

import * as fs from 'fs';
import * as path from 'path';

Check notice

Code scanning / CodeQL

Unused variable, import, function or class Note

Unused import path.

Copilot Autofix

AI 20 days ago

In general, unused imports should be removed to make the code clearer and avoid confusion about their purpose. For this file, the simplest, behavior‑preserving fix is to delete the unused path import line.

Concretely, in scripts/ci/generate-benchmark-summary.ts, remove line 9: import * as path from 'path';. No other code changes are needed because nothing references path. No additional methods, definitions, or imports are required.

Suggested changeset 1
scripts/ci/generate-benchmark-summary.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/scripts/ci/generate-benchmark-summary.ts b/scripts/ci/generate-benchmark-summary.ts
--- a/scripts/ci/generate-benchmark-summary.ts
+++ b/scripts/ci/generate-benchmark-summary.ts
@@ -6,7 +6,6 @@
  */
 
 import * as fs from 'fs';
-import * as path from 'path';
 
 interface BenchmarkMetric {
   name: string;
EOF
Contributor

github-actions bot commented Jan 17, 2026

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

Contributor

github-actions bot commented Jan 17, 2026

🎬 THE END: Smoke Claude MISSION: ACCOMPLISHED! The hero saves the day! ✨

Contributor

github-actions bot commented Jan 17, 2026

🌑 The shadows whisper... Smoke Codex failed. The oracle requires further meditation...

@github-actions
Contributor

🚨 Smoke Test Results: FAIL

Last 2 merged PRs (@Mossaka):

Test Results:

  • ✅ GitHub MCP: Retrieved PR data successfully
  • ❌ Playwright: Browser download blocked by firewall
  • ✅ File Writing: Created /tmp/gh-aw/agent/smoke-test-copilot-21092103208.txt
  • ✅ Bash Tool: Verified file content

Status: FAIL (Playwright blocked)

AI generated by Smoke Copilot

@Mossaka Mossaka closed this Jan 17, 2026

Development

Successfully merging this pull request may close these issues.

[Long-term] Add performance benchmarking suite

2 participants