fix(scan): stop attributing host-wide findings to per-target scans #53
Merged
jonathansantilli merged 1 commit into main on Apr 22, 2026
A targeted `scan <target>` currently surfaces two classes of finding whose `file_path` is outside the scan target. When the Mobb Agent Security dashboard groups scans by target, those host-wide findings make every skill look suspicious even though only one is. This fixes both leaks.

Layer 2, hidden-unicode (and every other rule-file detector): when the scan target is itself inside the user's home directory (for example a single skill at `~/.codex/skills/<name>`), user-scope wildcard patterns like `~/.agents/skills/*/SKILL.md` were walking the whole home tree and picking up sibling skills. Add a `shouldKeepUserScopeCandidate` guard that drops user-scope matches outside the scan target in exactly this case; project-root scans outside the home directory keep the existing host-wide context behavior unchanged.

Layer 3, `layer3OutcomesToFindings`: a successful outcome with no `metadata.findings[]` / `metadata.tools[]` was emitting a LOW `layer3-network_error` PARSE_ERROR finding whose `file_path` was the remote MCP URL (e.g. `https://mcp.linear.app/mcp`). The default resource executor deliberately makes no outbound calls, so this "schema mismatch" was the norm, not an anomaly, and the resulting finding polluted every per-target scan report with host-level noise. Treat all resource kinds the same way registry resources were already handled: silently skip when no actionable metadata is present.

Tests added:
- `tests/layer2/cross-scan-attribution.test.ts`: a sibling skill inside the same home is no longer attributed to the scanned skill, but is reported when the scan target is their parent; project scans outside the home still see user-scope files.
- `tests/layer3/network-error-suppression.test.ts`: HTTP/SSE schema mismatches produce no findings while timeout / auth_failure / command_error / skipped_without_consent still do.

`tests/layer3/layer3-integration.test.ts` is updated to match: the case that asserted the old schema-mismatch finding now asserts its absence.
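The Layer 2 guard described above can be illustrated with a minimal sketch. Only the name `shouldKeepUserScopeCandidate` comes from the commit message; the signature, the `isInside` helper, and the parameter names are assumptions made for illustration, and the real code in `src/scan.ts` may differ.

```typescript
import * as path from "node:path";

// Hypothetical helper: true when `child` equals `parent` or lies beneath it.
function isInside(parent: string, child: string): boolean {
  const rel = path.relative(path.resolve(parent), path.resolve(child));
  if (rel === "") return true;
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(rel);
}

// Assumed signature; the actual helper may take different arguments.
function shouldKeepUserScopeCandidate(
  candidatePath: string,
  scanTarget: string,
  homeDir: string,
): boolean {
  // Project scans outside the home directory keep host-wide context.
  if (!isInside(homeDir, scanTarget)) return true;
  // Target inside home: keep only candidates at or below the target.
  return isInside(scanTarget, candidatePath);
}
```

Under these assumptions, scanning `/home/u/.codex/skills/a` would drop a match at `/home/u/.agents/skills/b/SKILL.md`, while scanning `/home/u/.agents/skills` or a project at `/work/project` would keep it.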
github-actions bot pushed a commit that referenced this pull request on Apr 22, 2026
## [0.14.2](v0.14.1...v0.14.2) (2026-04-22)

### Bug Fixes

* **scan:** stop attributing host-wide findings to per-target scans ([#53](#53)) ([77f9627](77f9627))
jonathansantilli added a commit that referenced this pull request on Apr 22, 2026
PR #53 closed the cross-scan leak for skill-dir scans but not for single-file scans of configs like ~/.claude/settings.json.

Symptom:
  $ codegate-ai scan ~/.claude/settings.json
  → finding with file_path=~/.agents/skills/api-design-guide/.../SKILL.md

Root cause: the CLI stages single-file targets into a temp dir outside $HOME. The staged dir is not inside homeDir, so shouldKeepUserScopeCandidate short-circuits to `return true` and every sibling user-scope match (e.g. a hidden-unicode hit in a completely unrelated skill) gets attributed to the config scan.

Fix:
- cli.ts: when resolvedTarget.explicitCandidates is non-empty (the target was a staged local file), force scan_user_scope=false for that scan. Explicit opt-in via --include-user-scope still overrides. This matches user expectation: "scan this file" ≠ "scan my whole home."
- scan.ts: shouldKeepUserScopeCandidate now also handles engine-level file targets correctly (if the target is a file inside homeDir, only the target file itself is a valid user-scope candidate). This is defence in depth for library callers that bypass the CLI.

Tests:
- Existing 3 cases in tests/layer2/cross-scan-attribution.test.ts still pass.
- New: engine-level file-target scan drops sibling user-scope candidates.

Verified 154 test files / 720 tests pass. Lint + prettier clean.
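The two halves of this follow-up fix can be sketched roughly as below. `explicitCandidates`, `scan_user_scope`, and `--include-user-scope` are named in the commit message; the function names, interfaces, and parameters here are hypothetical and are not taken from `cli.ts` or `scan.ts`.

```typescript
import * as path from "node:path";

// Assumed shapes; only `explicitCandidates` is named in the commit message.
interface ResolvedTarget {
  explicitCandidates: string[];
}
interface CliFlags {
  includeUserScope: boolean; // stands in for --include-user-scope
}

// CLI side: decide the effective scan_user_scope value for one scan.
function resolveScanUserScope(
  target: ResolvedTarget,
  flags: CliFlags,
  configuredDefault: boolean,
): boolean {
  if (flags.includeUserScope) return true; // explicit opt-in wins
  if (target.explicitCandidates.length > 0) return false; // staged single file
  return configuredDefault;
}

// Engine side: for a file target inside the home directory, only the
// target file itself counts as a valid user-scope candidate.
function keepCandidateForFileTarget(
  candidatePath: string,
  targetFile: string,
  homeDir: string,
): boolean {
  const rel = path.relative(path.resolve(homeDir), path.resolve(targetFile));
  const targetInsideHome =
    rel !== "" &&
    rel !== ".." &&
    !rel.startsWith(".." + path.sep) &&
    !path.isAbsolute(rel);
  if (!targetInsideHome) return true; // outside home: nothing to restrict here
  return path.resolve(candidatePath) === path.resolve(targetFile);
}
```

In this sketch a target staged under a temp directory outside the home directory still passes the engine-side check for every candidate, which is why the CLI-side switch is needed as well.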
Summary
Targeted `scan <target>` reports were surfacing two classes of finding whose `file_path` lives outside the scan target. When the Mobb Agent Security dashboard groups scans by target, those host-wide findings make every skill look suspicious even though only one is. This PR fixes both leaks and adds regression tests.

Bug 1: L2 `rule-file-hidden-unicode` (and siblings) scan other skills

Root cause. User-scope wildcard patterns in the KB (e.g. `~/.agents/skills/*/SKILL.md`) walk the entire home directory. When the scan target is itself inside `$HOME` (for example a single skill at `~/.codex/skills/<name>`), those wildcards match sibling skills that belong to completely different scans. The hidden-unicode (and every other rule-file) finding for a sibling was attributed to the current scan.

Fix. In `src/scan.ts`, gate user-scope candidate acceptance with a new `shouldKeepUserScopeCandidate` helper:
- When the scan target is inside `$HOME`, user-scope matches outside the scan target are dropped.
- When the scan target is a project root outside `$HOME`, behavior is unchanged: user-scope files such as `~/.cursor/mcp.json` remain legitimate context.

Bug 2: L3 `layer3-network_error` for every host-configured MCP endpoint

Root cause. `layer3OutcomesToFindings` emitted a LOW `layer3-network_error` (category `PARSE_ERROR`) finding whose `file_path` was the remote URL (e.g. `https://mcp.linear.app/mcp`) whenever a successful deep-scan outcome carried no `metadata.findings[]` or `metadata.tools[]`. The default `executeDeepResource` in `src/cli.ts` deliberately makes no outbound calls, so this "schema mismatch" was the norm, not an anomaly. Every scan with deep enabled therefore leaked a host-level finding onto the per-target report.

Fix. Apply the same treatment registry resources already had: silently skip any "ok" outcome whose metadata yields no findings/tools. Genuine fetch-level failures (`timeout`, `auth_failure`, `command_error`, `skipped_without_consent`) still produce findings.

Tests

New:
- `tests/layer2/cross-scan-attribution.test.ts`: a sibling skill inside the same home is no longer attributed to the scanned skill; a parent-scope scan still surfaces both skills; project scans outside the home keep user-scope attribution.
- `tests/layer3/network-error-suppression.test.ts`: HTTP/SSE schema mismatches produce no findings while timeout / auth_failure / command_error / skipped_without_consent still do, and actionable `tools[]` findings are preserved.

Updated (old behavior was encoded as an expectation, same treatment as PR #52):
- `tests/layer3/layer3-integration.test.ts`: the case that asserted the old schema-mismatch finding now asserts its absence.

Totals: 712 → 719 tests (+7), 152 → 154 files (+2).
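As a rough illustration of the Bug 2 fix, the sketch below converts deep-scan outcomes into findings and silently skips successful outcomes that carry no actionable metadata. The status names come from the description above; the types, the `layer3-<status>` identifier pattern for failures, the LOW severity on failures, and the function name are assumptions, and `tools[]` handling is left out for brevity.

```typescript
// Assumed shapes for illustration; the real types may differ.
type OutcomeStatus =
  | "ok"
  | "timeout"
  | "auth_failure"
  | "command_error"
  | "skipped_without_consent";

interface ReportedFinding {
  id: string;
  severity: string;
}

interface DeepOutcome {
  resourceId: string; // e.g. a remote MCP URL
  status: OutcomeStatus;
  metadata?: { findings?: ReportedFinding[] };
}

interface Finding {
  finding_id: string;
  severity: string;
  file_path: string;
}

function outcomesToFindingsSketch(outcomes: DeepOutcome[]): Finding[] {
  const findings: Finding[] = [];
  for (const outcome of outcomes) {
    if (outcome.status !== "ok") {
      // Genuine fetch-level failures still produce a finding.
      findings.push({
        finding_id: `layer3-${outcome.status}`, // hypothetical id pattern
        severity: "LOW", // placeholder severity
        file_path: outcome.resourceId,
      });
      continue;
    }
    // "ok" with no actionable metadata: emit nothing instead of a
    // schema-mismatch finding attributed to the remote URL.
    for (const reported of outcome.metadata?.findings ?? []) {
      findings.push({
        finding_id: reported.id,
        severity: reported.severity,
        file_path: outcome.resourceId,
      });
    }
  }
  return findings;
}
```

With this shape, an "ok" outcome for `https://mcp.linear.app/mcp` with empty metadata contributes nothing to the report, while a `timeout` outcome for the same endpoint still does.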
Test plan
- `npm run typecheck`
- `npm run lint`
- `npx prettier --check .`
- `npm test` (154 files, 719 tests, all passing)
- New regression tests (`tests/layer2/cross-scan-attribution.test.ts`, `tests/layer3/network-error-suppression.test.ts`)
- No manual changes to `CHANGELOG.md` or `package.json`; release tooling handles those