Skip to content

SafeOutputs: hardenUnicodeText lacks Cyrillic/Greek homoglyph normalization; threat_detection.md "Encoded Strings" check not [Content truncated due to length] #25457

@szabta89

Description

@szabta89

Summary

Two confirmed architectural gaps in the SafeOutputs sanitization and threat-detection layers (v0.67.1). First, hardenUnicodeText in sanitize_content_core.cjs applies NFC normalization and strips zero-width/bidi controls and full-width ASCII (U+FF01–FF5E), but does not apply NFKC or a Unicode confusables table — Cyrillic and Greek lookalike characters pass through sanitizeContent unchanged and are submitted verbatim to the GitHub API. Second, the "Encoded Strings" check in threat_detection.md (Base64, hex, etc.) is nested as a sub-bullet under the "Malicious Patch" section; for create_issue-only workflows with no git patch, encoded strings in the issue body are not explicitly covered by either the static layer or a dedicated AI-detection instruction.

Affected Area

SafeOutputs content sanitization (hardenUnicodeText / sanitize_content_core.cjs) and AI threat-detection prompt (threat_detection.md) — the two defense-in-depth controls that mediate all write operations through the safe_outputs job.

Reproduction Outline

Gap 1 — Cyrillic homoglyph bypass (static layer)

  1. Substitute Latin characters in a sensitive string with visually identical Cyrillic codepoints (e.g., A→А U+0410, E→Е U+0415, T→Т U+0422, K→К U+041A, M→М U+041C, S→С U+0421, R→Р U+0420).
  2. Pass the substituted string as the body of a create_issue safe-outputs call in any gh-aw workflow.
  3. sanitizeContent is invoked by collect_ndjson_output.cjs; hardenUnicodeText runs NFC + full-width ASCII mapping — Cyrillic codepoints are unchanged.
  4. The body reaches the GitHub API verbatim.

Gap 2 — Encoded strings not flagged for create_issue-only workflows (AI detection layer)

  1. Use a create_issue-only workflow (no git patch produced).
  2. Include a Base64- or hex-encoded representation of a sensitive string in the proposed issue body.
  3. The detection model evaluates agent output under "Prompt Injection" and "Secret Leak" only; the "Encoded Strings" sub-check is contextually tied to the "Malicious Patch" section and is not reliably applied to issue body content.

Observed Behavior

  • Cyrillic-substituted content: sanitizeContent returns the input unchanged (input === output: true; confirmed against deployed v0.67.1 scripts).
  • Base64/hex-encoded content: passes through sanitizeContent unchanged; detection AI does not have an explicit instruction to flag encoded strings in non-patch output.

Expected Behavior

  • hardenUnicodeText should normalize Cyrillic/Greek lookalike characters to their Latin equivalents (e.g., via result.normalize("NFKC") or a Unicode TR#39 confusables table) before content reaches the GitHub API.
  • threat_detection.md should include an explicit instruction under the "Secret Leak" check to flag Base64, hex, or homoglyph-substituted representations of secrets regardless of whether a code patch is present.

Security Relevance

The SafeOutputs write path is the primary control preventing an AI agent from leaking sensitive content to GitHub. A homoglyph-substituted or encoded secret that bypasses both the static sanitizer and the AI detection prompt can be exfiltrated into a public issue body while appearing to satisfy all pre-write checks. This contradicts the documented guarantee that "detection analysis explicitly examines outputs for secret leaks before safe output jobs proceed."

Suggested Fixes

  1. In hardenUnicodeText (sanitize_content_core.cjs), add result = result.normalize("NFKC") after the existing NFC step, and/or add a Cyrillic→Latin / Greek→Latin confusables-map pass (Unicode TR#39).
  2. In threat_detection.md, add an explicit bullet under "Secret Leak" instructing the model to flag Base64, hex, ROT13, or homoglyph-substituted representations of secrets — independent of whether a code patch is present.
  3. Consider extending redact_secrets.cjs built-in patterns to scan create_issue body content as a static backstop independent of the AI detection result.

gh-aw version: v0.67.1

Original finding: https://github.com/githubnext/gh-aw-security/issues/1711

Generated by File Issue · ● 348.8K ·

Metadata

Metadata

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions