fix: use UTF-8 encoding for subprocess output on Windows #90
jrob5756 merged 2 commits into microsoft:main
On Windows, Python defaults to cp1252 (charmap) encoding for subprocess pipes and text=True mode. This causes Unicode characters (emoji, em-dash, etc.) to be garbled when passed between agents or posted via tools.

Changes:
- script.py: Set PYTHONUTF8=1 in subprocess environment so child Python processes use UTF-8 instead of system default encoding
- update.py: Add encoding='utf-8' to subprocess.run call
- mcp_auth.py: Add encoding='utf-8' to subprocess.run call

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Always set PYTHONUTF8=1 so child Python processes use UTF-8 encoding
# instead of the system default (cp1252 on Windows), preventing garbled
# Unicode characters in script output.
base_env = {**os.environ, "PYTHONUTF8": "1"}
env = {**base_env, **agent.env} if agent.env else base_env
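One consequence of the merge order in this snippet is that agent-supplied variables win over the default. A minimal sketch of that precedence, assuming a hypothetical `build_env` helper and a plain dict standing in for `agent.env` (neither name is part of the PR):

```python
import os
from typing import Dict, Optional


def build_env(agent_env: Optional[Dict[str, str]]) -> Dict[str, str]:
    """Hypothetical helper mirroring the two lines from the diff."""
    # Start from the parent environment and force UTF-8 mode for child Pythons.
    base_env = {**os.environ, "PYTHONUTF8": "1"}
    # Later keys win in a dict merge, so agent-supplied values take precedence.
    return {**base_env, **agent_env} if agent_env else base_env


default_env = build_env(None)
custom_env = build_env({"MY_FLAG": "on"})         # MY_FLAG is illustrative only
overridden_env = build_env({"PYTHONUTF8": "0"})   # an agent can still opt out
```

With this ordering, an agent that explicitly sets `PYTHONUTF8` keeps its own value, while every other agent inherits UTF-8 mode.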
script.py fixes the child side (env var), while update.py and mcp_auth.py only fix the parent side (encoding= param). For Python-based child CLIs, both sides should be addressed.
Good catch - addressed in e17ac2f. Both update.py and mcp_auth.py now set PYTHONUTF8=1 in the subprocess env alongside the existing encoding="utf-8" param, matching the approach in script.py. All three call sites are now consistent with both parent-side and child-side encoding fixes.
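The combined parent-side and child-side approach described here can be sketched roughly as follows. The `run_utf8` helper and the inline child command are illustrative assumptions, not code from update.py or mcp_auth.py:

```python
import os
import subprocess
import sys
from typing import Dict, List, Optional


def run_utf8(
    cmd: List[str], extra_env: Optional[Dict[str, str]] = None
) -> subprocess.CompletedProcess:
    """Hypothetical helper: run cmd with UTF-8 on both ends of the pipe."""
    # Child side: PYTHONUTF8=1 makes a Python child write UTF-8 to its stdout.
    base_env = {**os.environ, "PYTHONUTF8": "1"}
    env = {**base_env, **extra_env} if extra_env else base_env
    # Parent side: decode the captured bytes as UTF-8 instead of the locale default.
    return subprocess.run(
        cmd,
        capture_output=True,
        text=True,
        encoding="utf-8",
        env=env,
        check=False,
    )


# The child prints an accented letter, an em-dash, and an emoji.
child = [sys.executable, "-c", "print('caf\\u00e9 \\u2014 \\u2705')"]
result = run_utf8(child)
```

Setting only `encoding="utf-8"` in the parent would still mis-decode output from a Python child that writes in the system code page, which is the gap the review comment points out.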
Address review feedback: update.py and mcp_auth.py now set PYTHONUTF8=1 in the child process environment, matching the approach in script.py. This ensures both parent-side (encoding='utf-8') and child-side (PYTHONUTF8=1) encoding fixes are applied consistently across all subprocess call sites.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #90 +/- ##
=======================================
Coverage ? 85.42%
=======================================
Files ? 46
Lines ? 6441
Branches ? 0
=======================================
Hits ? 5502
Misses ? 939
Partials ? 0
Summary
Fixes #89 — Unicode characters (emoji, em-dash, etc.) are garbled on Windows when passed between agents or posted via tools.
Root Cause
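On Windows, Python decodes subprocess pipes opened in text mode (text=True) with the locale's preferred encoding unless an encoding is given explicitly. On many Windows systems that encoding is cp1252 (charmap), which mangles non-ASCII characters in script output and subprocess calls.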
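The garbling can be reproduced on any platform by decoding UTF-8 bytes with cp1252, which is roughly what happens when a child writes UTF-8 and the parent reads with the Windows default. A minimal sketch (the sample string is illustrative):

```python
# An em-dash (U+2014) and a check-mark emoji (U+2705), as a child might print them.
text = "done \u2014 \u2705"

# The child writes UTF-8 bytes to the pipe.
raw = text.encode("utf-8")

# A parent using cp1252 turns each multi-byte sequence into several
# unrelated characters (mojibake).
garbled = raw.decode("cp1252", errors="replace")

# Decoding with the matching encoding restores the original text.
restored = raw.decode("utf-8")
```

Here the three UTF-8 bytes of the em-dash (E2 80 94) come back as three separate cp1252 characters, which is the pattern reported in #89.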
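Changes
- script.py: set PYTHONUTF8=1 in the subprocess environment so child Python processes use UTF-8 instead of the system default encoding
- update.py: add encoding='utf-8' to the subprocess.run call
- mcp_auth.py: add encoding='utf-8' to the subprocess.run call
- update.py and mcp_auth.py (follow-up in e17ac2f): also set PYTHONUTF8=1 in the child process environment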
Testing
All 75 existing tests pass. Lint clean.