Summary
Two confirmed architectural gaps in the SafeOutputs sanitization and threat-detection layers (v0.67.1). First, hardenUnicodeText in sanitize_content_core.cjs applies NFC normalization, strips zero-width/bidi controls, and maps full-width ASCII (U+FF01–FF5E), but it does not apply a Unicode confusables table. NFKC is not applied either, although NFKC by itself would not fold cross-script lookalikes. As a result, Cyrillic and Greek lookalike characters pass through sanitizeContent unchanged and are submitted verbatim to the GitHub API. Second, the "Encoded Strings" check in threat_detection.md (Base64, hex, etc.) is nested as a sub-bullet under the "Malicious Patch" section. For create_issue-only workflows with no git patch, encoded strings in the issue body are therefore not explicitly covered by either the static layer or a dedicated AI-detection instruction.
Affected Area
SafeOutputs content sanitization (hardenUnicodeText / sanitize_content_core.cjs) and AI threat-detection prompt (threat_detection.md) — the two defense-in-depth controls that mediate all write operations through the safe_outputs job.
Reproduction Outline
Gap 1 — Cyrillic homoglyph bypass (static layer)
- Substitute Latin characters in a sensitive string with visually identical Cyrillic codepoints (e.g., A→А U+0410, E→Е U+0415, T→Т U+0422, K→К U+041A, M→М U+041C, C→С U+0421, P→Р U+0420).
- Pass the substituted string as the body of a create_issue safe-outputs call in any gh-aw workflow.
- sanitizeContent is invoked by collect_ndjson_output.cjs; hardenUnicodeText runs NFC plus the full-width ASCII mapping, which leaves Cyrillic codepoints unchanged.
- The body reaches the GitHub API verbatim.
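The substitution step can be checked in isolation with plain Node.js, without the gh-aw scripts. The snippet below is a minimal sketch: the sample string and the mapping object are hypothetical, and it exercises only the built-in String.prototype.normalize, not hardenUnicodeText itself. It also shows that NFKC folds full-width letters but leaves Cyrillic letters alone.

```javascript
// Latin -> Cyrillic lookalikes, written as escapes so the difference is visible in source.
const toCyrillic = {
  A: "\u0410", E: "\u0415", T: "\u0422", K: "\u041A",
  M: "\u041C", C: "\u0421", P: "\u0420",
};

// Hypothetical sensitive string (not taken from the report).
const latin = "TEAM-PACKET";
const spoofed = [...latin].map((ch) => toCyrillic[ch] ?? ch).join("");

// Built-in Unicode normalization only; this does not call the gh-aw sanitizer.
const nfc = spoofed.normalize("NFC");
const nfkc = spoofed.normalize("NFKC");

console.log(spoofed === latin); // false: different code points, same appearance
console.log(nfc === spoofed);   // true: NFC leaves Cyrillic letters unchanged
console.log(nfkc === spoofed);  // true: NFKC leaves them unchanged as well

// For contrast, NFKC does fold full-width Latin letters to ASCII.
const fullWidth = "\uFF21\uFF22\uFF23";
console.log(fullWidth.normalize("NFKC")); // "ABC"
```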
Gap 2 — Encoded strings not flagged for create_issue-only workflows (AI detection layer)
- Use a create_issue-only workflow (no git patch produced).
- Include a Base64- or hex-encoded representation of a sensitive string in the proposed issue body.
- The detection model evaluates agent output under "Prompt Injection" and "Secret Leak" only; the "Encoded Strings" sub-check is contextually tied to the "Malicious Patch" section and is not reliably applied to issue body content.
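To illustrate why an encoded value slips past a plain-text pattern, here is a small sketch using only Node's built-in Buffer. The secret value and the regular expression are hypothetical placeholders and do not reflect the patterns gh-aw actually ships.

```javascript
// Hypothetical secret and plain-text pattern; neither is taken from gh-aw.
const secret = "example-secret-value-12345";
const plainPattern = /example-secret-value-\d+/;

// Encode the secret the way an agent could before placing it in an issue body.
const base64 = Buffer.from(secret, "utf8").toString("base64");
const hex = Buffer.from(secret, "utf8").toString("hex");
const issueBody = `Diagnostics: ${base64} / ${hex}`;

console.log(plainPattern.test(secret));    // true: the raw secret is detected
console.log(plainPattern.test(issueBody)); // false: the encoded forms are not

// Anyone reading the issue can trivially recover the original value.
console.log(Buffer.from(base64, "base64").toString("utf8") === secret); // true
console.log(Buffer.from(hex, "hex").toString("utf8") === secret);       // true
```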
Observed Behavior
- Cyrillic-substituted content: sanitizeContent returns the input unchanged (input === output: true; confirmed against deployed v0.67.1 scripts).
- Base64/hex-encoded content: passes through sanitizeContent unchanged; detection AI does not have an explicit instruction to flag encoded strings in non-patch output.
Expected Behavior
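hardenUnicodeText should normalize Cyrillic/Greek lookalike characters to their Latin equivalents before content reaches the GitHub API. This requires a confusables table such as the one defined by Unicode TR#39 (UTS #39, Unicode Security Mechanisms). result.normalize("NFKC") folds compatibility forms such as full-width letters but leaves Cyrillic and Greek letters unchanged, so NFKC alone is not sufficient.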
threat_detection.md should include an explicit instruction under the "Secret Leak" check to flag Base64, hex, or homoglyph-substituted representations of secrets regardless of whether a code patch is present.
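A confusables pass could look roughly like the sketch below. The function name foldConfusables and the ten-entry table are hypothetical and purely illustrative; a real implementation would presumably be generated from the TR#39 confusables data rather than hand-written, and nothing here is taken from sanitize_content_core.cjs.

```javascript
// Tiny illustrative subset of Cyrillic/Greek -> Latin lookalikes (hypothetical table).
const CONFUSABLES = new Map([
  ["\u0410", "A"], // CYRILLIC CAPITAL LETTER A
  ["\u0415", "E"], // CYRILLIC CAPITAL LETTER IE
  ["\u0422", "T"], // CYRILLIC CAPITAL LETTER TE
  ["\u041A", "K"], // CYRILLIC CAPITAL LETTER KA
  ["\u041C", "M"], // CYRILLIC CAPITAL LETTER EM
  ["\u0421", "C"], // CYRILLIC CAPITAL LETTER ES
  ["\u0420", "P"], // CYRILLIC CAPITAL LETTER ER
  ["\u0391", "A"], // GREEK CAPITAL LETTER ALPHA
  ["\u0395", "E"], // GREEK CAPITAL LETTER EPSILON
  ["\u039F", "O"], // GREEK CAPITAL LETTER OMICRON
]);

// Hypothetical helper: apply NFKC first, then map each code point through the table.
function foldConfusables(text) {
  let out = "";
  for (const ch of text.normalize("NFKC")) {
    out += CONFUSABLES.get(ch) ?? ch;
  }
  return out;
}

const spoofed = "\u0422\u0415\u0410\u041C"; // Cyrillic lookalike of "TEAM"
console.log(foldConfusables(spoofed));              // "TEAM"
console.log(foldConfusables("\uFF34\u0415\u0391M")); // "TEAM" (full-width, Cyrillic, Greek, Latin)
```

One design question this raises: folding the text that is actually submitted would also rewrite legitimate Cyrillic or Greek prose. An alternative is to compute the folded form only for secret scanning and leave the submitted body untouched.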
Security Relevance
The SafeOutputs write path is the primary control preventing an AI agent from leaking sensitive content to GitHub. A homoglyph-substituted or encoded secret that bypasses both the static sanitizer and the AI detection prompt can be exfiltrated into a public issue body while appearing to satisfy all pre-write checks. This contradicts the documented guarantee that "detection analysis explicitly examines outputs for secret leaks before safe output jobs proceed."
Suggested Fixes
- In hardenUnicodeText (sanitize_content_core.cjs), add a Cyrillic→Latin / Greek→Latin confusables-map pass (Unicode TR#39). Adding result = result.normalize("NFKC") after the existing NFC step is a reasonable complement, but NFKC alone does not map cross-script lookalikes.
- In threat_detection.md, add an explicit bullet under "Secret Leak" instructing the model to flag Base64, hex, ROT13, or homoglyph-substituted representations of secrets, independent of whether a code patch is present.
- Consider extending redact_secrets.cjs built-in patterns to scan create_issue body content as a static backstop independent of the AI detection result.
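As a rough illustration of the static-backstop idea, the sketch below decodes Base64-looking and hex-looking runs in a body and re-applies the secret patterns to the decoded text. Everything in it is an assumption made for illustration: SECRET_PATTERNS, decodedCandidates, and containsEncodedSecret are hypothetical names, the length thresholds are arbitrary, and the actual structure of redact_secrets.cjs is not reproduced here.

```javascript
// Hypothetical secret pattern; real patterns would come from the existing redaction logic.
const SECRET_PATTERNS = [/example-secret-value-\d+/];

// Candidate encoded runs. The minimum lengths are arbitrary illustrative thresholds.
const BASE64_RUN = /[A-Za-z0-9+/]{16,}={0,2}/g;
const HEX_RUN = /\b(?:[0-9a-fA-F]{2}){8,}\b/g;

// Decode every candidate run; runs that are not really encoded just yield noise.
function decodedCandidates(body) {
  const out = [];
  for (const run of body.match(BASE64_RUN) ?? []) {
    out.push(Buffer.from(run, "base64").toString("utf8"));
  }
  for (const run of body.match(HEX_RUN) ?? []) {
    out.push(Buffer.from(run, "hex").toString("utf8"));
  }
  return out;
}

// True if a secret pattern matches the body itself or any decoded candidate.
function containsEncodedSecret(body) {
  const texts = [body, ...decodedCandidates(body)];
  return texts.some((text) => SECRET_PATTERNS.some((p) => p.test(text)));
}

const sample = Buffer.from("example-secret-value-12345").toString("base64");
console.log(containsEncodedSecret(`Diagnostics: ${sample}`)); // true
console.log(containsEncodedSecret("nothing to see here"));    // false
```

A check like this only covers single-layer Base64 and hex. ROT13, nested encodings, or values split across several fields would still depend on the AI detection layer.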
gh-aw version: v0.67.1
Original finding: https://github.com/githubnext/gh-aw-security/issues/1711