refactor: extract shared MCP script handler execution envelope by Copilot · Pull Request #29497 · github/gh-aw

Copilot · 2026-05-01T09:52:09Z

The Go, Python, JavaScript, and Shell MCP script handlers each duplicated ~70 lines of identical process execution logic — execFile setup, stdout/stderr logging, timeout/maxBuffer, JSON parsing, and MCP content wrapping. Error handling had already diverged: JS and Shell produced enhanced errors with stdout/stderr context; Go and Python forwarded the raw error.

Changes

New mcp_handler_process.cjs — shared execution envelope exposing:
- executeProcess(opts) — common execFile wrapper with onError (cleanup hook) and buildResult (custom result mapper) extension points
- buildEnhancedError(error, scriptPath, stdout, stderr) — informative errors with stdout/stderr context
- parseStdoutAsJson(stdout, stderr, onParseFailure) — JSON parse with text fallback
- wrapMCPContent(result) — MCP content response construction
Refactored handlers — all four handlers delegate to executeProcess; Go and Python error handling normalized to enhanced errors, matching JS/Shell
Shell-specific extensions preserved — onError handles GITHUB_OUTPUT file cleanup; buildResult handles key=value output parsing; cleanupOutputFile promoted to module-level
New tests — mcp_handler_process.test.cjs (26 tests), mcp_handler_python.test.cjs (7 tests, previously missing), mcp_handler_shell.test.cjs (11 tests, previously missing)

// Before: each handler owned its own execFile + error + JSON + MCP wrapping (~70 lines each)
// After: handlers reduce to preamble + one executeProcess call
return executeProcess({
  server, toolName, languageLabel: "Go",
  command: "go", args: ["run", scriptPath],
  env: process.env, inputJson, timeoutSeconds, scriptPath,
});

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/b0f10b19-1533-4c55-af8c-4575d5caa4e2 Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot

Pull request overview

Refactors MCP script handlers (Go/Python/JavaScript/Shell) to share a single process-execution “envelope” for consistent execFile behavior, logging, stdout parsing, and error enrichment, and adds targeted tests for the shared module and previously-untested handlers.

Changes:

Added mcp_handler_process.cjs to centralize process execution, enhanced errors, JSON parsing, and MCP response wrapping.
Updated Go/Python/JavaScript/Shell handlers to delegate execution to executeProcess, preserving Shell-specific output-file parsing/cleanup via hooks.
Added new Vitest coverage for the shared execution helper and for Python/Shell handlers.

Show a summary per file

File	Description
actions/setup/js/mcp_handler_process.cjs	New shared execFile wrapper + enhanced error/message building + stdout JSON parsing + MCP content wrapping.
actions/setup/js/mcp_handler_process.test.cjs	Unit tests for enhanced errors, JSON parsing behavior, MCP wrapping, and executeProcess hooks.
actions/setup/js/mcp_handler_shell.cjs	Refactors shell handler to use `executeProcess` with `onError` cleanup and `buildResult` output-file parsing.
actions/setup/js/mcp_handler_shell.test.cjs	New tests covering env input mapping, output parsing, cleanup behavior, timeout, and cwd behavior for shell scripts.
actions/setup/js/mcp_handler_python.cjs	Refactors python handler to use `executeProcess` (stdin JSON, enhanced errors, parsing).
actions/setup/js/mcp_handler_python.test.cjs	New tests covering Python execution, JSON/non-JSON output behavior, timeouts, cwd behavior, and enhanced errors.
actions/setup/js/mcp_handler_javascript.cjs	Refactors JS handler to use `executeProcess` while keeping stdin JSON behavior.
actions/setup/js/mcp_handler_go.cjs	Refactors Go handler to use `executeProcess` while keeping stdin JSON behavior.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Files reviewed: 8/8 changed files
Comments generated: 3

+ * @param {Error} error - The original execution error
+ * @param {string} scriptPath - Path to the script, used for context in the message
+ * @param {string} stdout - Process stdout output
+ * @param {string} stderr - Process stderr output
+ * @returns {Error} Enhanced error with stdout/stderr context
+ */
+function buildEnhancedError(error, scriptPath, stdout, stderr) {
+  const exitCode = typeof error.code === "number" ? error.code : 1;
+  const parts = [`Command failed: ${scriptPath} (exit code: ${exitCode})`];
+  if (stderr && stderr.trim()) {
+    parts.push(`stderr:\n${stderr.trim()}`);
+  }
+  if (stdout && stdout.trim()) {
+    parts.push(`stdout:\n${stdout.trim()}`);
+  }
+  return new Error(parts.join("\n"));


+        if (error) {
+          server.debugError(`  [${toolName}] ${languageLabel} script error: `, error);
+          if (onError) {
+            onError(error, stdout, stderr);


+        if (buildResult) {
+          result = buildResult(stdout, stderr);
+        } else {
+          result = parseStdoutAsJson(stdout, stderr, () => {
+            server.debug(`  [${toolName}] Output is not JSON, returning as text`);
+          });


github-actions · 2026-05-01T10:19:07Z

🧪 Test Quality Sentinel Report

Test Quality Score: 80/100

✅ Excellent test quality

Metric	Value
New/modified tests analyzed	44
✅ Design tests (behavioral contracts)	44 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	35 (80%)
Duplicate test clusters	0
Test inflation detected	⚠️ Yes — 2 files exceed 2:1 ratio (refactoring context)
🚨 Coding-guideline violations	None

Test Classification Details

View all 44 test classifications

Test	File	Classification	Notes
`should include script path and exit code`	`mcp_handler_process.test.cjs`	✅ Design	Verifies error message content
`should include stderr when present`	`mcp_handler_process.test.cjs`	✅ Design	Error path covered
`should include stdout when present`	`mcp_handler_process.test.cjs`	✅ Design	Behavioral output check
`should omit empty stdout and stderr`	`mcp_handler_process.test.cjs`	✅ Design	Edge case: empty inputs
`should omit whitespace-only stdout and stderr`	`mcp_handler_process.test.cjs`	✅ Design	Edge case: whitespace inputs
`should fall back to exit code 1 for non-numeric error codes`	`mcp_handler_process.test.cjs`	✅ Design	Edge case: non-numeric code
`should include both stdout and stderr when both are present`	`mcp_handler_process.test.cjs`	✅ Design	Both outputs present
`should parse valid JSON stdout`	`mcp_handler_process.test.cjs`	✅ Design	Happy path
`should trim whitespace before parsing JSON`	`mcp_handler_process.test.cjs`	✅ Design	Edge case: whitespace trimming
`should return stdout/stderr object when stdout is empty`	`mcp_handler_process.test.cjs`	✅ Design	Edge case: empty stdout
`should return stdout/stderr object when stdout is whitespace only`	`mcp_handler_process.test.cjs`	✅ Design	Edge case: whitespace stdout
`should return stdout/stderr object when JSON parsing fails`	`mcp_handler_process.test.cjs`	✅ Design	Error path: bad JSON
`should call onParseFailure when JSON parsing fails`	`mcp_handler_process.test.cjs`	✅ Design	Observable callback side-effect
`should not call onParseFailure for valid JSON`	`mcp_handler_process.test.cjs`	✅ Design	Negative case
`should not call onParseFailure for empty stdout`	`mcp_handler_process.test.cjs`	✅ Design	Negative case: empty
`should wrap result in MCP content format`	`mcp_handler_process.test.cjs`	✅ Design	Contract: MCP format shape
`should JSON-stringify the result`	`mcp_handler_process.test.cjs`	✅ Design	Output serialization
`should execute a process and return MCP content`	`mcp_handler_process.test.cjs`	✅ Design	E2E process execution
`should write inputJson to stdin`	`mcp_handler_process.test.cjs`	✅ Design	Input propagation contract
`should reject with enhanced error on process failure`	`mcp_handler_process.test.cjs`	✅ Design	Error path: non-zero exit
`should include stderr in enhanced error`	`mcp_handler_process.test.cjs`	✅ Design	Error message content
`should call onError hook before rejecting`	`mcp_handler_process.test.cjs`	✅ Design	Observable hook side-effect
`should use buildResult when provided`	`mcp_handler_process.test.cjs`	✅ Design	Custom result builder
`should fall back to stdout/stderr object for non-JSON output`	`mcp_handler_process.test.cjs`	✅ Design	Edge case: non-JSON stdout
`should respect the timeout setting`	`mcp_handler_process.test.cjs`	✅ Design	Timeout enforcement
`should use GITHUB_WORKSPACE as cwd when set`	`mcp_handler_process.test.cjs`	✅ Design	Env-driven cwd behavior
`should execute a simple Python script`	`mcp_handler_python.test.cjs`	✅ Design	E2E Python execution
`should handle Python script with no input`	`mcp_handler_python.test.cjs`	✅ Design	Edge case: no input
`should handle non-JSON output`	`mcp_handler_python.test.cjs`	✅ Design	Edge case: non-JSON
`should respect timeout setting`	`mcp_handler_python.test.cjs`	✅ Design	Timeout enforcement
`should handle Python script errors with enhanced error message`	`mcp_handler_python.test.cjs`	✅ Design	Error message content
`should pass complex input data`	`mcp_handler_python.test.cjs`	✅ Design	Complex data propagation
`should execute script from GITHUB_WORKSPACE directory`	`mcp_handler_python.test.cjs`	✅ Design	Env-driven cwd behavior
`should execute a simple shell script`	`mcp_handler_shell.test.cjs`	✅ Design	E2E shell execution
`should pass inputs as environment variables`	`mcp_handler_shell.test.cjs`	✅ Design	Input→env propagation
`should convert hyphenated input keys to INPUT_NAME format`	`mcp_handler_shell.test.cjs`	✅ Design	Key normalization contract
`should capture stdout and stderr`	`mcp_handler_shell.test.cjs`	✅ Design	Output capture
`should return empty outputs when GITHUB_OUTPUT is empty`	`mcp_handler_shell.test.cjs`	✅ Design	Edge case: no outputs
`should handle shell script errors with enhanced error message`	`mcp_handler_shell.test.cjs`	✅ Design	Error message content
`should clean up output file after successful execution`	`mcp_handler_shell.test.cjs`	✅ Design	Resource cleanup (success)
`should clean up output file after script error`	`mcp_handler_shell.test.cjs`	✅ Design	Resource cleanup (error)
`should respect timeout setting`	`mcp_handler_shell.test.cjs`	✅ Design	Timeout enforcement
`should execute script from GITHUB_WORKSPACE directory`	`mcp_handler_shell.test.cjs`	✅ Design	Env-driven cwd behavior
`should handle multiple outputs`	`mcp_handler_shell.test.cjs`	✅ Design	Multiple key=value outputs

Flagged Tests — Requires Review

No tests are flagged. All 44 tests verify behavioral contracts.

Inflation Note

The test inflation penalty (10 pts) was applied because two handler files show a >2:1 ratio of test lines added vs. production lines added:

Test File	Test Lines Added	Production Lines Added	Ratio
`mcp_handler_python.test.cjs`	174	11	15.8×
`mcp_handler_shell.test.cjs`	231	54	4.3×

Context: This is a refactoring PR that extracted shared execution logic into mcp_handler_process.cjs (155 lines). The python and shell handlers were simplified, so their added-line count is low. The new test files cover the full behavior of each handler — this is appropriate coverage for newly-extracted shared logic, not test bloat. The inflation flag is technically correct per the metric but the 10-pt deduction reflects the mechanical calculation, not a quality problem.

Language Support

Tests analyzed:

🐹 Go (*_test.go): 0 tests
🟨 JavaScript (*.test.cjs): 44 tests (vitest)

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). All 44 tests verify observable behavioral contracts across process execution, error handling, input propagation, timeout enforcement, and resource cleanup.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

Assert on observable outputs, return values, or state changes
Cover error paths and boundary conditions
Would catch a behavioral regression if deleted
Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

Assert on internal function calls (mocking internals)
Only test the happy path with typical inputs
Break during legitimate refactoring even when behavior is correct
Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §25210794262

🧪 Test quality analysis by Test Quality Sentinel · ● 463.5K · ◷

github-actions

✅ Test Quality Sentinel: 80/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 44 tests verify behavioral contracts.

pelikhan · 2026-05-01T10:25:49Z

@copilot review all
Comments

…t hooks Agent-Logs-Url: https://github.com/github/gh-aw/sessions/7848f82c-7265-4814-9346-9565909c2241 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-05-01T10:31:38Z

@copilot review all
Comments

Addressed all three review findings in df0e346:

buildEnhancedError accuracy — now distinguishes numeric exit codes (exit code: N), signal-killed processes (signal: SIGTERM), and OS-level failures like ENOENT (preserves original error.message verbatim). The old "fall back to 1" catch-all is gone.
onError safety — wrapped in try/catch; cleanup errors are logged via server.debugError and then suppressed so the original execution error is always what rejects the promise.
buildResult safety — wrapped in try/catch; falls back to { stdout, stderr } on failure so a broken custom result builder can't crash the process.

Initial plan

04cee4b

Copilot AI assigned Copilot and gh-aw-bot May 1, 2026

Copilot started work on behalf of gh-aw-bot May 1, 2026 09:52 View session

Copilot AI linked an issue May 1, 2026 that may be closed by this pull request

Duplicate Code: MCP script handler execution envelope #29496

Closed

5 tasks

refactor: extract shared MCP script handler execution envelope (#issue)

8871347

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/b0f10b19-1533-4c55-af8c-4575d5caa4e2 Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Refactor duplicate MCP script handler execution envelope~~ refactor: extract shared MCP script handler execution envelope May 1, 2026

Copilot finished work on behalf of gh-aw-bot May 1, 2026 10:06

Copilot AI requested a review from gh-aw-bot May 1, 2026 10:06

pelikhan marked this pull request as ready for review May 1, 2026 10:13

Copilot AI review requested due to automatic review settings May 1, 2026 10:13

Copilot started reviewing on behalf of pelikhan May 1, 2026 10:14 View session

Copilot AI reviewed May 1, 2026

View reviewed changes

github-actions Bot mentioned this pull request May 1, 2026

[aw] No-Op Runs #29134

Open

github-actions Bot approved these changes May 1, 2026

View reviewed changes

Copilot started work on behalf of pelikhan May 1, 2026 10:25 View session

fix: improve buildEnhancedError accuracy and guard onError/buildResul…

df0e346

…t hooks Agent-Logs-Url: https://github.com/github/gh-aw/sessions/7848f82c-7265-4814-9346-9565909c2241 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan May 1, 2026 10:32

Copilot AI requested a review from pelikhan May 1, 2026 10:32

pelikhan merged commit 5050748 into main May 1, 2026

pelikhan deleted the copilot/refactor-mcp-handler-execution-envelope branch May 1, 2026 10:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: extract shared MCP script handler execution envelope#29497

refactor: extract shared MCP script handler execution envelope#29497
pelikhan merged 3 commits intomainfrom
copilot/refactor-mcp-handler-execution-envelope

Copilot AI commented May 1, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

github-actions Bot commented May 1, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

pelikhan commented May 1, 2026

Uh oh!

Copilot AI commented May 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

Copilot AI commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

github-actions Bot commented May 1, 2026

🧪 Test Quality Sentinel Report

Test Quality Score: 80/100

Test Classification Details

Flagged Tests — Requires Review

Inflation Note

Language Support

Verdict

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

pelikhan commented May 1, 2026

Uh oh!

Copilot AI commented May 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Copilot AI commented May 1, 2026 •

edited

Loading