fix(security): preserve leading BOM in strip_dangerous by dadavidtseng · Pull Request #372 · microsoft/apm

dadavidtseng · 2026-03-19T16:42:18Z

Summary

ContentScanner.strip_dangerous() was stripping all BOM characters (U+FEFF), including a leading BOM at position 0. This contradicts its documented contract:

Info-level characters (emoji selectors, non-breaking spaces, unusual whitespace) are preserved — they are legitimate and stripping them would break content.

scan_text() correctly classifies a leading BOM as "info" severity (standard practice for UTF-8 files), but strip_dangerous() unconditionally stripped it anyway. As a result, running apm audit --strip on a file with a legitimate leading BOM incorrectly removed the BOM.
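For context, here is a small sketch of how a legitimate leading BOM ends up at position 0 of decoded text. It uses only the Python standard library; the helper name `decode_keeping_bom` is hypothetical and is not part of apm.

```python
import codecs

BOM = "\ufeff"  # U+FEFF as it appears in decoded text


def decode_keeping_bom(raw: bytes) -> str:
    """Decode UTF-8 bytes without consuming a leading BOM.

    The plain "utf-8" codec keeps the BOM as U+FEFF at index 0,
    whereas "utf-8-sig" would silently drop it.
    """
    return raw.decode("utf-8")


# Editors that save as "UTF-8 with BOM" prefix the file with EF BB BF.
raw = "# Title\n".encode("utf-8-sig")
text = decode_keeping_bom(raw)

print(raw.startswith(codecs.BOM_UTF8))  # the bytes begin with the UTF-8 BOM
print(text[0] == BOM)                   # the decoded text begins with U+FEFF
```

A scanner that reads files with the plain "utf-8" codec would therefore see U+FEFF at index 0 for such files, which is the case the PR treats as info-level.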

Fix

Check whether the BOM is at position 0 before deciding to strip it
Leading BOM (info-level) is now preserved; mid-file BOMs (warning-level) are still stripped
Updated the corresponding test to assert the leading BOM is preserved
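The difference between the old and new behavior can be sketched as below. This is an illustration only, not the code in content_scanner.py: both function names are hypothetical, and the real strip_dangerous() handles other character classes that are not shown here.

```python
BOM = "\ufeff"  # U+FEFF


def strip_all_boms(text: str) -> str:
    """Behavior described as the bug: every BOM is removed."""
    return text.replace(BOM, "")


def strip_midfile_boms(text: str) -> str:
    """Behavior described as the fix: only a BOM at index 0 survives."""
    return "".join(
        char
        for index, char in enumerate(text)
        if char != BOM or index == 0
    )


# One leading BOM (info-level) and one mid-file BOM (warning-level).
sample = BOM + "key: value\n" + BOM + "other: value\n"

before = strip_all_boms(sample)      # leading BOM is lost
after = strip_midfile_boms(sample)   # leading BOM is kept
```

With this input, `before` no longer starts with U+FEFF while `after` does, and neither contains the mid-file BOM.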

Files changed

src/apm_cli/security/content_scanner.py — 3-line fix in strip_dangerous()
tests/unit/test_content_scanner.py — updated test to match corrected behavior

Test plan

All 78 content scanner tests pass
All 41 audit command tests pass
Leading BOM at position 0 is preserved (info-level)
Mid-file BOM is still stripped (warning-level)

strip_dangerous() was stripping all BOM characters (U+FEFF), including a leading BOM at position 0. This contradicts its documented contract which states that info-level characters are preserved — and scan_text() classifies a leading BOM as info severity since it is standard practice for UTF-8 files. The fix checks whether the BOM is at position 0 before deciding to strip it. Mid-file BOMs (warning-level) are still stripped as before. Updated the corresponding test to assert the leading BOM is preserved.

dadavidtseng · 2026-03-19T16:44:50Z

@microsoft-github-policy-service agree

Copilot

Pull request overview

This PR fixes ContentScanner.strip_dangerous() to preserve a legitimate leading BOM (U+FEFF) while still stripping mid-file BOMs, aligning behavior with the documented “info-level characters are preserved” contract.

Changes:

Update strip_dangerous() to keep BOM at index 0 and remove BOMs elsewhere.
Update the unit test to assert a leading BOM remains unchanged.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `src/apm_cli/security/content_scanner.py` | Adjust BOM handling in `strip_dangerous()` to preserve the leading BOM only. |
| `tests/unit/test_content_scanner.py` | Update test to validate preservation of a leading BOM. |

src/apm_cli/security/content_scanner.py

Restructure the BOM handling branch so the mid-file strip path uses an early continue and the leading-BOM path falls through to the common append, making the two behaviors self-documenting.

Co-authored-by: Copilot <copilot@github.com>
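The control flow described in that commit message might look roughly like the following. The loop is a guess at the shape of the branch, reduced to the BOM case alone; the function name `strip_bom_early_continue` is hypothetical and the merged code may differ.

```python
BOM = "\ufeff"  # U+FEFF


def strip_bom_early_continue(text: str) -> str:
    """Sketch of the restructured BOM branch (hypothetical, BOM only)."""
    result = []
    for index, char in enumerate(text):
        if char == BOM:
            if index > 0:
                # Mid-file BOM (warning-level): drop it and move on.
                continue
            # Leading BOM (info-level): fall through to the append below.
        result.append(char)
    return "".join(result)


cleaned = strip_bom_early_continue(BOM + "a" + BOM + "b")
```

Keeping a single append at the bottom of the loop means the only way to drop a character is the explicit continue, so the strip path and the preserve path are easy to tell apart when reading the code.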

danielmeppiel

Clean, well-scoped fix. strip_dangerous() now correctly preserves leading BOM (info-level) while still stripping mid-file BOM (warning-level), aligning with the documented security model contract.

Please ensure CI tests pass before merging. Thanks for the contribution! 🎉

dadavidtseng · 2026-03-20T09:17:26Z

@danielmeppiel All CI checks have passed, but the Integration Tests (PR) required check is stuck at "Waiting for status to be reported" and was never triggered. Could you re-run it or advise on how to trigger it? Thanks!

danielmeppiel · 2026-03-20T09:22:24Z

It's a workflow gated on approval - it's running now, will merge. Thanks a lot for your contribution!

dadavidtseng requested a review from danielmeppiel as a code owner March 19, 2026 16:42

Copilot AI review requested due to automatic review settings March 19, 2026 16:42

Copilot AI reviewed Mar 19, 2026


Copilot started reviewing on behalf of dadavidtseng March 19, 2026 18:06

danielmeppiel approved these changes Mar 20, 2026


Merge branch 'main' into fix/preserve-leading-bom-in-strip-dangerous

5ad5a86

danielmeppiel merged commit 862c280 into microsoft:main Mar 20, 2026
7 checks passed

danielmeppiel mentioned this pull request Mar 20, 2026

chore: release v0.8.3 #386

Merged

github-actions bot mentioned this pull request Mar 21, 2026

[ca] Update APM (Agent Package Manager) to v0.8.3 github/gh-aw#22118

Open

danielmeppiel merged 3 commits into microsoft:main from dadavidtseng:fix/preserve-leading-bom-in-strip-dangerous


3 participants
