feat: add performance benchmarking suite by Copilot · Pull Request #258 · github/gh-aw-firewall

Copilot · 2026-01-17T04:22:05Z

Implements a performance benchmarking suite to track and prevent performance regressions over time, as outlined in the CI/CD Gap Assessment.

Benchmark Infrastructure

src/benchmarks/benchmark-types.ts - Types for results, stats, reports, regression detection
src/benchmarks/benchmark-runner.ts - Runner class with statistical analysis (min/max/mean/median/stdDev), regression detection, markdown formatting
Unit tests for benchmark utilities (13 tests)
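
The statistical analysis mentioned above (min/max/mean/median/stdDev) could look roughly like the following. This is a minimal sketch, not the code in src/benchmarks/benchmark-runner.ts: the `BenchmarkStats` shape and the `computeStats` name are hypothetical, and the choice of population standard deviation (dividing by n) is an assumption.

```typescript
// Hypothetical shape; the real types live in src/benchmarks/benchmark-types.ts and may differ.
interface BenchmarkStats {
  min: number;
  max: number;
  mean: number;
  median: number;
  stdDev: number;
}

// Computes summary statistics over a non-empty list of samples.
function computeStats(samples: number[]): BenchmarkStats {
  if (samples.length === 0) {
    throw new Error('computeStats requires at least one sample');
  }
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...samples].sort((a, b) => a - b);
  const n = sorted.length;
  const mean = sorted.reduce((sum, x) => sum + x, 0) / n;
  const mid = Math.floor(n / 2);
  // Even-length input: average the two middle values.
  const median = n % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
  // Assumption: population variance (divide by n), not sample variance (n - 1).
  const variance = sorted.reduce((sum, x) => sum + (x - mean) ** 2, 0) / n;
  return { min: sorted[0], max: sorted[n - 1], mean, median, stdDev: Math.sqrt(variance) };
}
```

Reporting the median alongside the mean is useful for timing data, where a single slow run can skew the mean noticeably.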

Metrics Tracked

Container startup time
HTTP request latency through proxy
File download time through proxy
Combined container memory usage
Blocked domain rejection time
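
As an illustration of how time-based metrics such as these might be sampled, here is a small sketch. The `measureAsync` helper is hypothetical and is not taken from the PR; it simply times repeated runs of an async operation with `performance.now()`.

```typescript
import { performance } from 'perf_hooks';

// Hypothetical helper: runs `fn` `iterations` times and returns each
// wall-clock duration in milliseconds.
async function measureAsync(
  fn: () => Promise<void>,
  iterations: number,
): Promise<number[]> {
  const durations: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    await fn();
    durations.push(performance.now() - start);
  }
  return durations;
}
```

The returned samples can then be fed into whatever statistics the runner computes.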

CI Integration

.github/workflows/benchmark.yml - Runs on push/PR to main, uploads JSON report artifact, generates GitHub Actions summary
scripts/ci/generate-benchmark-summary.ts - Parses Jest output into markdown summary
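
The rendering half of a summary script could be sketched as below. This covers only the markdown step, not the Jest output parsing, and the `SummaryRow` shape and `toMarkdownSummary` name are assumptions made for illustration rather than the script's actual interface.

```typescript
// Hypothetical row shape; the script's real BenchmarkMetric interface may carry different fields.
interface SummaryRow {
  name: string;
  value: number;
  unit: string;
}

// Renders rows as a GitHub-flavored markdown table.
function toMarkdownSummary(rows: SummaryRow[]): string {
  const header = ['| Metric | Value |', '| --- | --- |'];
  const body = rows.map((r) => `| ${r.name} | ${r.value.toFixed(2)} ${r.unit} |`);
  return [...header, ...body].join('\n');
}
```

In a GitHub Actions job, a string like this can be appended to the file named by the `GITHUB_STEP_SUMMARY` environment variable to appear in the run summary.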

Usage

# Run benchmarks locally (requires sudo for iptables)
sudo -E npm run test:benchmark

# View report
cat /tmp/awf-benchmark-report.json | jq

Regression Detection

const thresholds: RegressionThreshold[] = [
  { metric: 'startup_time_ms', maxIncreasePercent: 20 },
  { metric: 'memory_mb', maxIncreasePercent: 25 },
];

const result = detectRegressions(current, baseline, thresholds);
// result.hasRegression, result.regressions, result.improvements
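
For readers who want to see how a threshold check of this kind can work, here is a minimal sketch. It is not the PR's implementation: the function is deliberately named `detectRegressionsSketch`, the `MetricChange` and `RegressionResult` shapes are assumed, and it assumes `current` and `baseline` are plain maps from metric name to a value where lower is better.

```typescript
interface RegressionThreshold {
  metric: string;
  maxIncreasePercent: number;
}

// Assumed result entry; the PR's real types may differ.
interface MetricChange {
  metric: string;
  baseline: number;
  current: number;
  changePercent: number;
}

interface RegressionResult {
  hasRegression: boolean;
  regressions: MetricChange[];
  improvements: MetricChange[];
}

// Assumes metric maps where a lower value is better (time, memory).
function detectRegressionsSketch(
  current: Record<string, number>,
  baseline: Record<string, number>,
  thresholds: RegressionThreshold[],
): RegressionResult {
  const regressions: MetricChange[] = [];
  const improvements: MetricChange[] = [];
  for (const { metric, maxIncreasePercent } of thresholds) {
    const base = baseline[metric];
    const cur = current[metric];
    // Skip metrics that are missing or have a zero baseline (percent change is undefined).
    if (base === undefined || cur === undefined || base === 0) continue;
    const changePercent = ((cur - base) / base) * 100;
    const change: MetricChange = { metric, baseline: base, current: cur, changePercent };
    if (changePercent > maxIncreasePercent) regressions.push(change);
    else if (changePercent < 0) improvements.push(change);
  }
  return { hasRegression: regressions.length > 0, regressions, improvements };
}
```

Increases that stay within the threshold are treated as noise in this sketch and land in neither list.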

Original prompt

This section details the original issue you should resolve.

<issue_title>[Long-term] Add performance benchmarking suite</issue_title>
<issue_description>## Context
From CI/CD Pipeline Gap Assessment (Discussion #227)

Description

Create a performance benchmarking suite to track and prevent performance regressions over time.

Acceptance Criteria

Define key performance metrics to track (e.g., startup time, memory usage, throughput)

Implement benchmark tests

Set up automated benchmark execution in CI

Track benchmark results over time

Alert on significant performance regressions

Estimated Effort

To be determined (quarterly goal)

Related

Discussion CI/CD Pipeline and Integration Tests - Gap Assessment #227</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes [Long-term] Add performance benchmarking suite #240

Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>

github-actions · 2026-01-17T09:10:32Z

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

Metric	Base	PR	Delta
Lines	77.19%	76.28%	📉 -0.91%
Statements	77.27%	76.35%	📉 -0.92%
Functions	77.17%	74.38%	📉 -2.79%
Branches	69.76%	69.12%	📉 -0.64%

✨ New Files (1 file)

src/benchmarks/benchmark-runner.ts: 64.0% lines

Coverage comparison generated by scripts/ci/compare-coverage.ts

tests/benchmarks/performance.benchmark.ts

+    // Generate and save benchmark report
+    const report = await benchmarkRunner.generateReport();
+    const reportPath = path.join(os.tmpdir(), 'awf-benchmark-report.json');
+    fs.writeFileSync(reportPath, JSON.stringify(report, null, 2));


In general, to fix insecure temp file creation, avoid manually constructing paths under os.tmpdir() and instead use a secure temp-file API that (a) creates the file atomically, (b) ensures it doesn’t already exist, and (c) sets safe permissions. In Node.js, a common approach is to use the tmp library’s fileSync() / file() functions to obtain a securely created temporary file, then write to that file.

For this specific code, the safest non-breaking fix is:

Replace the manual path.join(os.tmpdir(), 'awf-benchmark-report.json') construction with a call to tmp.fileSync, using a prefix/suffix so the filename is recognizable but still unique.

Use the returned .name as reportPath and continue to write to it with fs.writeFileSync exactly as before.

Add an import for tmp at the top of tests/benchmarks/performance.benchmark.ts.

Concrete changes in tests/benchmarks/performance.benchmark.ts:

Add import * as tmp from 'tmp'; alongside the existing imports.

In afterAll, change:

const reportPath = path.join(os.tmpdir(), 'awf-benchmark-report.json');
fs.writeFileSync(reportPath, JSON.stringify(report, null, 2));

to:

const tmpFile = tmp.fileSync({ prefix: 'awf-benchmark-report-', postfix: '.json' });
const reportPath = tmpFile.name;
fs.writeFileSync(reportPath, JSON.stringify(report, null, 2));

This preserves functionality (a JSON benchmark report saved to a temp file and its path logged) while ensuring the temp file is created securely.
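
If adding the `tmp` dependency is undesirable, the same three properties (atomic creation, no pre-existing file, restrictive permissions) can be approximated with Node's built-in `fs` module. The sketch below is a swapped-in alternative, not the Autofix suggestion: the `writeReportSecurely` helper and the `report.json` file name are hypothetical.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper: writes a JSON report into a freshly created,
// uniquely named temp directory and returns the report path.
function writeReportSecurely(report: unknown): string {
  // mkdtempSync appends random characters to the prefix and creates the
  // directory in one step, so the name cannot be predicted in advance.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'awf-benchmark-report-'));
  const reportPath = path.join(dir, 'report.json');
  // 'wx' fails if the path already exists; mode 0o600 limits access to the owner.
  fs.writeFileSync(reportPath, JSON.stringify(report, null, 2), { flag: 'wx', mode: 0o600 });
  return reportPath;
}
```

Because the path is no longer fixed, any consumer of the report needs the returned path, for example from the log line or an environment variable.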

scripts/ci/generate-benchmark-summary.ts

+ */
+
+import * as fs from 'fs';
+import * as path from 'path';


In general, unused imports should be removed to make the code clearer and avoid confusion about their purpose. For this file, the simplest, behavior‑preserving fix is to delete the unused path import line.

Concretely, in scripts/ci/generate-benchmark-summary.ts, remove line 9: import * as path from 'path';. No other code changes are needed because nothing references path. No additional methods, definitions, or imports are required.

github-actions · 2026-01-17T09:25:57Z

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

github-actions · 2026-01-17T09:26:01Z

🎬 THE END — Smoke Claude MISSION: ACCOMPLISHED! The hero saves the day! ✨

github-actions · 2026-01-17T09:26:04Z

🌑 The shadows whisper... Smoke Codex failed. The oracle requires further meditation...

github-actions · 2026-01-17T09:28:30Z

🚨 Smoke Test Results: FAIL

Last 2 merged PRs (@Mossaka):

#229: chore: delete existing firewall tests and migrate smoke tests
#244: feat(ci): add coverage regression detection

Test Results:

✅ GitHub MCP: Retrieved PR data successfully
❌ Playwright: Browser download blocked by firewall
✅ File Writing: Created /tmp/gh-aw/agent/smoke-test-copilot-21092103208.txt
✅ Bash Tool: Verified file content

Status: FAIL (Playwright blocked)

AI generated by Smoke Copilot

Initial plan

eb733ef

Copilot AI assigned Copilot and Mossaka Jan 17, 2026

Copilot started work on behalf of Mossaka January 17, 2026 04:53 View session

Copilot AI and others added 2 commits January 17, 2026 05:03

feat: add performance benchmarking suite with CI workflow

ae6b469

Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>

fix: address code review feedback for benchmark tests

3ff32ee

Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Add performance benchmarking suite to track regressions~~ feat: add performance benchmarking suite Jan 17, 2026

Copilot finished work on behalf of Mossaka January 17, 2026 05:08

Copilot AI requested a review from Mossaka January 17, 2026 05:08

Mossaka marked this pull request as ready for review January 17, 2026 09:10

github-advanced-security bot found potential problems Jan 17, 2026

Merge remote-tracking branch 'origin/main' into copilot/add-performan…

8c9bce5

…ce-benchmarking-suite

Mossaka closed this Jan 17, 2026

@@ -43,7 +43,8 @@
                 "chalk": "^4.1.2",
                 "commander": "^12.0.0",
                 "execa": "^5.1.1",
-                "js-yaml": "^4.1.1"
+                "js-yaml": "^4.1.1",
+                "tmp": "^0.2.5"
               },
               "devDependencies": {
                 "@commitlint/cli": "^20.1.0",

Package	Version	Security advisories
tmp (npm)	0.2.5	None

Copilot wants to merge 4 commits into main from copilot/add-performance-benchmarking-suite

2 participants

@@ -16,6 +16,7 @@
             import * as fs from 'fs';
             import * as path from 'path';
             import * as os from 'os';
+            import * as tmp from 'tmp';
             import execa = require('execa');
             describe('Performance Benchmarks', () => {
@@ -38,7 +39,8 @@
               afterAll(async () => {
                 // Generate and save benchmark report
                 const report = await benchmarkRunner.generateReport();
-                const reportPath = path.join(os.tmpdir(), 'awf-benchmark-report.json');
+                const tmpFile = tmp.fileSync({ prefix: 'awf-benchmark-report-', postfix: '.json' });
+                const reportPath = tmpFile.name;
                 fs.writeFileSync(reportPath, JSON.stringify(report, null, 2));
                 console.log(`\nBenchmark report saved to: ${reportPath}`);

@@ -6,7 +6,6 @@
              */
             import * as fs from 'fs';
-            import * as path from 'path';
             interface BenchmarkMetric {
               name: string;
