fix(scan): stop cross-scan leak on single-file scans of configs by jonathansantilli · Pull Request #54 · jonathansantilli/codegate

jonathansantilli · 2026-04-22T16:45:29Z

Summary

Closes the remaining cross-scan leak left open by #53. Scanning a single file under $HOME (e.g. ~/.claude/settings.json) still surfaced findings from unrelated siblings like ~/.agents/skills/*/SKILL.md.

Reproducer (0.14.2)

$ npx codegate-ai@0.14.2 scan ~/.claude/settings.json --format json
scan_target: /Users/me/.claude/settings.json
findings:
  - HIGH rule-file-hidden-unicode
    file_path: ~/.agents/skills/api-design-guide/domains/rest/SKILL.md

The finding doesn't belong to that scan.

Root cause

The CLI stages single-file targets via stageLocalFile, which copies them into a temp dir (outside $HOME). The scan engine then runs against the staged temp dir. In shouldKeepUserScopeCandidate:

scanTarget = /tmp/codegate-scan-target-xxx
homeDir = /Users/me
isPathInside(homeDir, scanTarget) = false
→ function returns true for every candidate
→ the user-scope walk of $HOME leaks every hidden-unicode match back into the scan

#53 scoped its fix to scan targets inside homeDir. Staged file targets aren't inside homeDir, so they skipped the fix.
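
The short-circuit above can be modelled in a few lines. This is a simplified sketch, not the code in src/scan.ts: the bodies of `isPathInside` and `shouldKeepUserScopeCandidatePreFix` are assumptions reconstructed from the trace, and the "keep only candidates under the target" branch is a guess at what #53 added.

```typescript
// Hypothetical helper: true when `child` equals `parent` or lies beneath it.
// Assumes normalised absolute POSIX paths.
function isPathInside(parent: string, child: string): boolean {
  const base = parent.endsWith("/") ? parent : parent + "/";
  return child === parent || child.startsWith(base);
}

// Simplified model of the pre-fix behaviour (not the real source).
function shouldKeepUserScopeCandidatePreFix(
  candidate: string,
  scanTarget: string,
  homeDir: string,
): boolean {
  // Targets outside $HOME skip the #53 filter entirely.
  if (!isPathInside(homeDir, scanTarget)) return true;
  // Assumed shape of the #53 filter: keep only candidates under the target.
  return isPathInside(scanTarget, candidate);
}

const homeDir = "/Users/me";
const stagedTarget = "/tmp/codegate-scan-target-xxx";
const sibling =
  "/Users/me/.agents/skills/api-design-guide/domains/rest/SKILL.md";

console.log(isPathInside(homeDir, stagedTarget)); // false
console.log(
  shouldKeepUserScopeCandidatePreFix(sibling, stagedTarget, homeDir),
); // true: the unrelated SKILL.md is kept
```

Because the staged copy lives under /tmp, the filter never runs and the sibling is attributed to the config scan.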

Fix

CLI layer (primary): when resolvedTarget.explicitCandidates is non-empty (i.e. the raw target was a local file, now staged), force scan_user_scope = false for that scan. Explicit opt-in via --include-user-scope still overrides. This matches user expectation: "scan this file" ≠ "scan my whole home."

Engine layer (defence in depth): shouldKeepUserScopeCandidate now also handles engine-level file targets. If the target is a file inside homeDir, only the file itself is a valid user-scope candidate. Library callers bypassing the CLI get the same guarantee.
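
A minimal sketch of the engine-layer rule, under stated assumptions: the `targetKind` parameter is hypothetical (the real function presumably inspects the filesystem), `isPathInside` is a stand-in helper, and the branch for targets outside homeDir is assumed to be unchanged from before.

```typescript
type TargetKind = "file" | "directory"; // hypothetical; illustration only

// Stand-in helper, assuming normalised absolute POSIX paths.
function isPathInside(parent: string, child: string): boolean {
  const base = parent.endsWith("/") ? parent : parent + "/";
  return child === parent || child.startsWith(base);
}

// Sketch of the post-fix filter; not the actual src/scan.ts implementation.
function shouldKeepUserScopeCandidate(
  candidate: string,
  scanTarget: string,
  targetKind: TargetKind,
  homeDir: string,
): boolean {
  // Assumed unchanged: targets outside $HOME are not filtered here
  // (staged CLI targets are handled by the CLI-layer gate instead).
  if (!isPathInside(homeDir, scanTarget)) return true;
  // New rule: a file target inside $HOME admits only itself.
  if (targetKind === "file") return candidate === scanTarget;
  // Directory targets keep candidates beneath them (assumed #53 behaviour).
  return isPathInside(scanTarget, candidate);
}

const home = "/Users/me";
const target = "/Users/me/.claude/settings.json";
const candidates = [
  target,
  "/Users/me/.agents/skills/api-design-guide/domains/rest/SKILL.md",
];

const kept = candidates.filter((c) =>
  shouldKeepUserScopeCandidate(c, target, "file", home),
);
console.log(kept); // only the settings.json path remains
```

A library caller that passes the real file path straight to the engine gets the same result the CLI user does, without depending on the staging step.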

Tests

tests/layer2/cross-scan-attribution.test.ts:

Existing 3 cases from #53 (fix(scan): stop attributing host-wide findings to per-target scans) still pass.
New: engine-level file-target scan drops sibling user-scope candidates.

Checks

npm run typecheck ✅
npm run lint ✅
npm test: 154 files / 720 tests (+1 test compared with #53)
npx prettier --check ✅

Key files

src/cli.ts — CLI-layer fix
src/scan.ts — engine-layer fix (shouldKeepUserScopeCandidate)
tests/layer2/cross-scan-attribution.test.ts — new engine-level file-target test

PR #53 closed the cross-scan leak for skill-dir scans but not for single-file scans of configs like ~/.claude/settings.json.

Symptom:

    $ codegate-ai scan ~/.claude/settings.json
    → finding with file_path=~/.agents/skills/api-design-guide/.../SKILL.md

Root cause: the CLI stages single-file targets into a temp dir outside $HOME. The staged dir is not inside homeDir, so shouldKeepUserScopeCandidate short-circuits to `return true` and every sibling user-scope match (e.g. a hidden-unicode hit in a completely unrelated skill) gets attributed to the config scan.

Fix:

- cli.ts: when resolvedTarget.explicitCandidates is non-empty (the target was a staged local file), force scan_user_scope=false for that scan. Explicit opt-in via --include-user-scope still overrides. This matches user expectation: "scan this file" ≠ "scan my whole home."
- scan.ts: shouldKeepUserScopeCandidate now also handles engine-level file targets correctly (if the target is a file inside homeDir, only the target file itself is a valid user-scope candidate). This is defence in depth for library callers that bypass the CLI.

Tests:

- Existing 3 cases in tests/layer2/cross-scan-attribution.test.ts still pass.
- New: engine-level file-target scan drops sibling user-scope candidates.

Verified 154 test files / 720 tests pass. Lint + prettier clean.

## [0.14.3](v0.14.2...v0.14.3) (2026-04-22)

### Bug Fixes

* **scan:** disable user-scope walk when CLI scans a single file ([#54](#54)) ([6799651](6799651))

fix(scan): close cross-scan leak for single-file targets of any format (#55)

PR #54 disabled the user-scope walk when the CLI scanned a single local file, by gating on `explicitCandidates.length > 0`. That gate breaks for files whose extension is not in `inferTextLikeFormat` (e.g. `.idea/workspace.xml`, `.env` with unusual names, binary-ish configs): `collectExplicitCandidates` returns `[]` for them, the guard never fires, and sibling user-scope findings (e.g. a hidden-unicode hit in `~/.agents/skills/foo/SKILL.md`) leak into the scan of the unrelated file.

Reproducer (0.14.3):

    $ npx codegate-ai scan ~/workspace/.idea/workspace.xml --format json
    scan_target: .../.idea/workspace.xml
    findings:
      - HIGH rule-file-hidden-unicode
        file_path: ~/.agents/skills/api-design-guide/domains/rest/SKILL.md

## Fix

Add a `stagedFromLocalFile: true` flag to `ResolvedScanTarget`, set from `stageLocalFile`. The CLI gate now uses this flag directly:

    scan_user_scope =
      --include-user-scope ? true
      : stagedFromLocalFile ? false
      : baseConfig.scan_user_scope

The flag is a signal that doesn't depend on whether the file's extension was recognisable. It covers every file type and needs no per-format maintenance.

## Tests

`tests/scan-target.test.ts`:

- Staged `.xml` file gets `stagedFromLocalFile=true` AND empty `explicitCandidates` (the PR #54 gate would have failed here).
- Staged `.json` file also gets `stagedFromLocalFile=true` and populated `explicitCandidates`, so there is no regression on the happy path.

All 155 files / 722 tests pass. Lint + prettier + typecheck clean.
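
The difference between the two gates can be sketched side by side. This is an illustration only: the interface shapes, the `GateOptions` type, and the function names `gateFromPr54` / `gateFromPr55` are hypothetical, and only the field names `explicitCandidates`, `stagedFromLocalFile`, and `scan_user_scope` come from the commit message.

```typescript
// Hypothetical, trimmed-down shape of the resolved target.
interface ResolvedScanTarget {
  explicitCandidates: string[];
  stagedFromLocalFile: boolean;
}

// Hypothetical bundle of the two inputs the gate reads.
interface GateOptions {
  includeUserScope: boolean; // --include-user-scope
  baseScanUserScope: boolean; // baseConfig.scan_user_scope
}

// Gate as described for PR #54: keyed on recognised explicit candidates.
function gateFromPr54(t: ResolvedScanTarget, o: GateOptions): boolean {
  if (o.includeUserScope) return true;
  if (t.explicitCandidates.length > 0) return false;
  return o.baseScanUserScope;
}

// Gate as described for PR #55: keyed on the staging flag.
function gateFromPr55(t: ResolvedScanTarget, o: GateOptions): boolean {
  return o.includeUserScope
    ? true
    : t.stagedFromLocalFile
      ? false
      : o.baseScanUserScope;
}

// A staged .xml file: staged, but no explicit candidates were inferred.
const stagedXml: ResolvedScanTarget = {
  explicitCandidates: [],
  stagedFromLocalFile: true,
};
const defaults: GateOptions = {
  includeUserScope: false,
  baseScanUserScope: true,
};

console.log(gateFromPr54(stagedXml, defaults)); // true  (user-scope walk still runs)
console.log(gateFromPr55(stagedXml, defaults)); // false (walk disabled)
```

In both versions the explicit `--include-user-scope` opt-in is checked first, so it overrides the staging-based default.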

## [0.14.4](v0.14.3...v0.14.4) (2026-04-22)

### Bug Fixes

* **scan:** close cross-scan leak for single-file targets of any format ([#55](#55)) ([46e2148](46e2148)), closes [#54](#54)

jonathansantilli merged commit 6799651 into main Apr 22, 2026
16 checks passed

jonathansantilli deleted the fix/cross-scan-file-target-attribution branch April 22, 2026 17:03

github-actions Bot pushed a commit that referenced this pull request Apr 22, 2026

chore(release): 0.14.3 [skip ci]

7359778


jonathansantilli mentioned this pull request Apr 22, 2026

fix(scan): close cross-scan leak for single-file targets of any format #55

Merged

jonathansantilli merged 1 commit into main from fix/cross-scan-file-target-attribution