fix(scan): close cross-scan leak for single-file targets of any format #55

Merged
jonathansantilli merged 1 commit into main from fix/cross-scan-all-file-types
Apr 22, 2026

@jonathansantilli
Owner

Summary

Closes a leftover edge case from #54: user-scope findings still leak into scans of single-file targets whose extension isn't in `inferTextLikeFormat` (e.g. `.xml`). The gate introduced in #54 relied on `explicitCandidates.length > 0`, but `explicitCandidates` is empty for those files.

Reproducer on 0.14.3

```
$ npx codegate-ai scan ~/.idea/workspace.xml --format json
findings:
  - HIGH rule-file-hidden-unicode
    file_path: ~/.agents/skills/api-design-guide/domains/rest/SKILL.md
```

The XML file has nothing to do with the skill; the finding leaks in because the user-scope walker ran unfiltered.
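
To illustrate why the #54 gate misses this case, here is a minimal sketch. The extension set and both functions are hypothetical stand-ins written for this example; the real `inferTextLikeFormat` and `collectExplicitCandidates` may recognise different extensions and return richer objects.

```typescript
// Assumed, illustrative list of recognised extensions (not the real one).
const KNOWN_TEXT_EXTENSIONS = new Set<string>([".json", ".md", ".yaml"]);

// Hypothetical stand-in: returns the file as a candidate only when its
// extension is recognised, otherwise an empty array.
function collectExplicitCandidatesSketch(filePath: string): string[] {
  const dot = filePath.lastIndexOf(".");
  const ext = dot >= 0 ? filePath.slice(dot).toLowerCase() : "";
  return KNOWN_TEXT_EXTENSIONS.has(ext) ? [filePath] : [];
}

// Shape of the #54 gate: the user-scope walk is skipped only when at
// least one explicit candidate was collected.
function oldGateSkipsUserScope(filePath: string): boolean {
  return collectExplicitCandidatesSketch(filePath).length > 0;
}

console.log(oldGateSkipsUserScope("settings.json")); // true
console.log(oldGateSkipsUserScope("workspace.xml")); // false: the walk still runs
```

For the `.xml` target the candidate list is empty, so the gate never fires and the user-scope walker runs as if a whole project were being scanned.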

Fix

`stageLocalFile` now sets `stagedFromLocalFile: true` on `ResolvedScanTarget`. The CLI gate in `cli.ts` checks that flag directly instead of `explicitCandidates.length > 0`, so it works for any file type.
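
A minimal sketch of the new gate, following the precedence spelled out in the commit message (`--include-user-scope` wins, then the staged-file flag, then the base config). The interface and function names here are hypothetical and trimmed to the fields this PR mentions; the actual code in `cli.ts` may be structured differently.

```typescript
// Hypothetical, reduced view of ResolvedScanTarget.
interface ResolvedScanTargetSketch {
  explicitCandidates: string[];
  stagedFromLocalFile?: boolean; // set by stageLocalFile for single-file targets
}

interface GateInput {
  includeUserScope: boolean; // value of --include-user-scope
  target: ResolvedScanTargetSketch;
  baseScanUserScope: boolean; // baseConfig.scan_user_scope
}

// Decide whether the user-scope walk should run for this scan.
function resolveScanUserScope(input: GateInput): boolean {
  if (input.includeUserScope) return true; // explicit opt-in always wins
  if (input.target.stagedFromLocalFile) return false; // single-file scan: stay scoped
  return input.baseScanUserScope; // otherwise defer to config
}

// A staged .xml file: no explicit candidates, but the flag is set.
const xmlTarget: ResolvedScanTargetSketch = {
  explicitCandidates: [],
  stagedFromLocalFile: true,
};

console.log(
  resolveScanUserScope({ includeUserScope: false, target: xmlTarget, baseScanUserScope: true }),
); // false
```

Because the decision no longer reads `explicitCandidates`, the result is the same whether or not the file's extension was recognised.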

Tests

New cases in tests/scan-target.test.ts:

- `stageLocalFile` on a `.xml` produces `stagedFromLocalFile=true` with an empty `explicitCandidates` (reproduces the class of bug).
- `stageLocalFile` on a `.json` produces `stagedFromLocalFile=true` with populated candidates (no regression).

All 155 files / 722 tests pass. `npm run typecheck`, `npm run lint`, `npx prettier --check` clean.

PR #54 disabled the user-scope walk when the CLI scanned a single local
file, by gating on `explicitCandidates.length > 0`. That gate breaks
for files whose extension is not in `inferTextLikeFormat` (e.g.
`.idea/workspace.xml`, `.env` files with unusual names, binary-ish
configs). `collectExplicitCandidates` returns `[]` for them, so the
guard never fires and sibling user-scope findings (e.g. a hidden-unicode
hit in `~/.agents/skills/foo/SKILL.md`) leak into the scan of the
unrelated file.

Reproducer (0.14.3):

  $ npx codegate-ai scan ~/workspace/.idea/workspace.xml --format json
  scan_target: .../.idea/workspace.xml
  findings:
    - HIGH rule-file-hidden-unicode
      file_path: ~/.agents/skills/api-design-guide/domains/rest/SKILL.md

## Fix

Add a `stagedFromLocalFile: true` flag to `ResolvedScanTarget`, set
from `stageLocalFile`. The CLI gate now uses this flag directly:

  scan_user_scope =
    --include-user-scope ? true
    : stagedFromLocalFile ? false
    : baseConfig.scan_user_scope

It's a signal that doesn't depend on whether the file's extension was
recognisable. Covers every file type, no per-format maintenance.

## Tests

`tests/scan-target.test.ts`:
- Staged `.xml` file gets `stagedFromLocalFile=true` AND empty
  `explicitCandidates` (the PR #54 gate would have failed here).
- Staged `.json` file also gets `stagedFromLocalFile=true` and
  populated `explicitCandidates` — no regression on the happy path.

All 155 files / 722 tests pass. Lint + prettier + typecheck clean.

@jonathansantilli merged commit 46e2148 into main Apr 22, 2026
16 checks passed
@jonathansantilli deleted the fix/cross-scan-all-file-types branch April 22, 2026 21:26
github-actions Bot pushed a commit that referenced this pull request Apr 22, 2026
## [0.14.4](v0.14.3...v0.14.4) (2026-04-22)

### Bug Fixes

* **scan:** close cross-scan leak for single-file targets of any format ([#55](#55)) ([46e2148](46e2148)), closes [#54](#54) [#54](#54)