fix: use UTF-8 encoding for subprocess output on Windows #90

Merged
jrob5756 merged 2 commits into microsoft:main from taipoweredpm:fix/windows-unicode-encoding
Apr 17, 2026

@taipoweredpm
Contributor

Summary

Fixes #89 — Unicode characters (emoji, em-dash, etc.) are garbled on Windows when passed between agents or posted via tools.

Root Cause

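On Windows, Python decodes subprocess pipes opened in text mode (text=True) with the locale's preferred encoding unless an encoding is passed explicitly. On many Windows systems that encoding is cp1252 (reported as "charmap" in error messages), which mangles non-ASCII characters in script output and subprocess calls.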
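
To make the failure mode concrete, here is a minimal sketch that reproduces the garbling without spawning a process. It assumes the child wrote UTF-8 bytes and the parent decoded them as cp1252; the sample string and variable names are made up for illustration.

```python
# Illustrative only: simulate a parent decoding a child's UTF-8 output as cp1252.
original = "Done \u2705 \u2014 all good"  # contains a check-mark emoji and an em-dash

raw = original.encode("utf-8")  # the bytes a UTF-8 child would write to the pipe

# errors="replace" avoids an exception on the few byte values cp1252 leaves undefined.
garbled = raw.decode("cp1252", errors="replace")
round_trip = raw.decode("utf-8")

# ascii() keeps the printout safe even on a console that cannot encode emoji.
print(ascii(garbled))     # each non-ASCII character has turned into several
print(ascii(round_trip))  # matches the original string
```

Decoding with the same encoding the child used is the only case where the text survives intact.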

Changes

  • executor/script.py: Set PYTHONUTF8=1 in subprocess env so child Python processes use UTF-8
  • cli/update.py: Added encoding="utf-8" to subprocess.run()
  • mcp_auth.py: Added encoding="utf-8" to subprocess.run()
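
As a rough illustration of what the child-side setting does, the sketch below starts a child interpreter with PYTHONUTF8=1 and asks it to report whether UTF-8 mode is active. The probe script and variable names are assumptions for this example, not code from the PR.

```python
import os
import subprocess
import sys

# Copy the current environment and enable Python's UTF-8 mode for the child.
child_env = {**os.environ, "PYTHONUTF8": "1"}

# Hypothetical probe: the child prints its UTF-8 mode flag and stdout encoding.
probe = "import sys; print(sys.flags.utf8_mode, sys.stdout.encoding.lower())"

result = subprocess.run(
    [sys.executable, "-c", probe],
    env=child_env,
    capture_output=True,
    encoding="utf-8",  # parent side: decode the pipe as UTF-8
    check=True,
)

report = result.stdout.strip()
print(report)  # the first field is the child's sys.flags.utf8_mode value
```

Note that PYTHONUTF8 only affects child processes that are themselves Python interpreters; other executables choose their output encoding independently.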

Testing

All 75 existing tests pass. Lint clean.

On Windows, Python defaults to cp1252 (charmap) encoding for subprocess
pipes and text=True mode. This causes Unicode characters (emoji, em-dash,
etc.) to be garbled when passed between agents or posted via tools.

Changes:
- script.py: Set PYTHONUTF8=1 in subprocess environment so child Python
  processes use UTF-8 instead of system default encoding
- update.py: Add encoding='utf-8' to subprocess.run call
- mcp_auth.py: Add encoding='utf-8' to subprocess.run call

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@taipoweredpm
Contributor Author

@microsoft-github-policy-service agree company="Microsoft"

Collaborator

@jrob5756 jrob5756 left a comment

One small comment.

Comment on lines +95 to +99
# Always set PYTHONUTF8=1 so child Python processes use UTF-8 encoding
# instead of the system default (cp1252 on Windows), preventing garbled
# Unicode characters in script output.
base_env = {**os.environ, "PYTHONUTF8": "1"}
env = {**base_env, **agent.env} if agent.env else base_env
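
One consequence of the merge order in this snippet: because agent.env is unpacked last, an agent that sets PYTHONUTF8 itself overrides the default. A minimal sketch of that precedence, where the hypothetical helper build_env stands in for the inline logic and plain dicts stand in for os.environ and agent.env:

```python
def build_env(os_env, agent_env):
    """Mirror the snippet's merge order: inherited env, then PYTHONUTF8, then agent overrides."""
    base_env = {**os_env, "PYTHONUTF8": "1"}
    return {**base_env, **agent_env} if agent_env else base_env


# No agent overrides: the default is applied on top of the inherited environment.
default_env = build_env({"PATH": "/usr/bin"}, None)

# An inherited PYTHONUTF8=0 is replaced by the default.
forced_env = build_env({"PYTHONUTF8": "0"}, None)

# An explicit agent setting is applied last, so it takes precedence.
override_env = build_env({"PATH": "/usr/bin"}, {"PYTHONUTF8": "0"})

print(default_env["PYTHONUTF8"], forced_env["PYTHONUTF8"], override_env["PYTHONUTF8"])
```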
Collaborator

script.py fixes the child side (env var), while update.py and mcp_auth.py only fix the parent side (encoding= param). For Python-based child CLIs, both sides should be addressed.

Contributor Author

Good catch - addressed in e17ac2f. Both update.py and mcp_auth.py now set PYTHONUTF8=1 in the subprocess env alongside the existing encoding="utf-8" param, matching the approach in script.py. All three call sites are now consistent with both parent-side and child-side encoding fixes.
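
The combined approach described here could look roughly like the sketch below. The helper name run_utf8 and its extra_env parameter are hypothetical and are not taken from update.py or mcp_auth.py; only the PYTHONUTF8=1 setting and the encoding="utf-8" argument come from the PR.

```python
import os
import subprocess
import sys


def run_utf8(cmd, extra_env=None):
    """Run cmd with both fixes applied: UTF-8 mode in the child, UTF-8 decoding in the parent."""
    env = {**os.environ, "PYTHONUTF8": "1", **(extra_env or {})}
    return subprocess.run(
        cmd,
        env=env,               # child side: Python children write UTF-8
        capture_output=True,
        encoding="utf-8",      # parent side: decode the pipes as UTF-8
        check=False,
    )


# Round-trip a check-mark emoji and an em-dash through a child interpreter.
result = run_utf8([sys.executable, "-c", "print('\\u2705 \\u2014')"])
output = result.stdout.strip()
print(ascii(output))
```

Setting only one side leaves a mismatch possible: the child may still write in the system encoding, or the parent may still decode with it.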

Address review feedback: update.py and mcp_auth.py now set PYTHONUTF8=1
in the child process environment, matching the approach in script.py.
This ensures both parent-side (encoding='utf-8') and child-side
(PYTHONUTF8=1) encoding fixes are applied consistently across all
subprocess call sites.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@codecov-commenter

Codecov Report

❌ Patch coverage is 85.71429% with 1 line in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@198aebc). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/conductor/mcp_auth.py | 50.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main      #90   +/-   ##
=======================================
  Coverage        ?   85.42%           
=======================================
  Files           ?       46           
  Lines           ?     6441           
  Branches        ?        0           
=======================================
  Hits            ?     5502           
  Misses          ?      939           
  Partials        ?        0           

Collaborator

@jrob5756 jrob5756 left a comment

lgtm. Thanks!

@jrob5756 jrob5756 merged commit 92dc2f6 into microsoft:main Apr 17, 2026
7 checks passed

Development

Successfully merging this pull request may close these issues.

Windows: Unicode characters in agent output are garbled (charmap encoding)

4 participants