feat(allowedsymbols): add structural AST rules for Scanner.Buffer and OpenFile.Close by thieman · Pull Request #162 · DataDog/rshell

thieman · 2026-03-27T17:26:57Z

Summary

Adds two structural AST rules enforced on every file checked by checkAllowedSymbols, plus as standalone go/analysis analyzers (ScannerBufferAnalyzer, OpenFileCloseAnalyzer)
Rule 1: every bufio.NewScanner result must have .Buffer() called in the same function scope — prevents silent truncation on lines > 64 KiB
Rule 2: every .OpenFile(...) result must be closed via defer or explicit .Close() — prevents fd exhaustion over repeated script executions; understands hand-off (rc = f; defer rc.Close()) and return-ownership-transfer (return f, nil) patterns
Fixes two genuine violations caught by the new rules in builtins/internal/procinfo/procinfo_linux.go
Documents both rules in allowedsymbols/README.md §Structural Rules

Test plan

go test ./allowedsymbols/ — all unit tests pass including new TestScannerBuffer* and TestOpenFileClose* tests
go test ./interp/... — no regressions
Verify false-positive cases: hand-off pattern, return ownership transfer
CI green

🤖 Generated with Claude Code

… OpenFile.Close Two new structural rules enforced on every file that passes through checkAllowedSymbols, plus as standalone go/analysis analyzers: Rule 1 (ScannerBufferAnalyzer): every bufio.NewScanner result must have .Buffer() called in the same function scope. Without it the scanner uses a 64 KiB fixed buffer and silently fails on longer lines, a reliability and DoS risk for builtins handling arbitrary input. Rule 2 (OpenFileCloseAnalyzer): every callCtx.OpenFile result must be closed via defer or explicit Close() in the same function scope. The checker understands hand-off assignments (rc = f; defer rc.Close()) and return ownership transfer (return f, nil) to avoid false positives. Also fixes two genuine violations caught by the new rules: - builtins/internal/procinfo/procinfo_linux.go: add .Buffer() to two scanners reading /proc files. Documents both rules in allowedsymbols/README.md §Structural Rules with compliant and violation examples. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Renames the package directory, Go package name, and GitHub Actions workflow from "allowedsymbols" to "analysis" to better reflect that it now enforces both symbol-level import restrictions and structural code patterns (not just allowed symbols). - allowedsymbols/ → analysis/ (package analysis) - .github/workflows/allowed-symbols.yml → analysis.yml - Analyzer Name field: "allowedsymbols" → "analysis" - All internal references updated throughout The GitHub label "verified/allowed_symbols" is unchanged — it is a GitHub-side configuration that cannot be renamed via code. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… label Adds the new verified/analysis label as the canonical gate label, while keeping verified/allowed_symbols accepted for backwards compatibility with in-flight PRs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

thieman

Security-Focused Code Review — PR #162

Reviewer == PR author; using COMMENT event per review policy.

This PR adds two structural AST rules (checkFileScannerBuffer, checkFileOpenFileClose) to the analysis package and fixes two genuine violations in builtins/internal/procinfo/procinfo_linux.go. The rename of allowedsymbols/ → analysis/ is clean and mechanical. The structural checker concept is sound and fills a real gap. The findings below are about correctness holes in the checker logic that can produce false negatives.

Summary of Findings

Sev	Area	Finding
P1	Correctness / Security	`returned` map: false negative when `f` is returned in one branch but leaked in another
P2	Correctness	`isClosedTransitive` only one hop deep — chained hand-offs are missed
P2	Correctness	Variable name reuse (same name assigned twice via `:=` or `=`) silently satisfies the earlier close check
P2	Integration	`NewAnalyzer` (standalone `go/analysis` path) does NOT invoke the structural rules
P3	Test Coverage	No test for multi-branch return (returned in one branch, leaked in another)
P3	Test Coverage	No test for chained hand-off (`a=f; b=a; defer b.Close()`)
P3	Test Coverage	No test for variable name reuse (`f` reassigned via second `:=`)
P3	Test Coverage	`collectViolations` silently drops format args — only violation counts are checkable

The `procinfo_linux.go` fixes

Both .Buffer() additions are correct: they cap the scanner at 1 MiB (matching the project-wide convention) and are placed immediately after bufio.NewScanner(...), before the first Scan() call. No issues.

`forEachFuncBody` + `inspectBody` correctness

forEachFuncBody correctly enumerates every scope: ast.Inspect visits FuncDecl and FuncLit nodes and calls fn for each independently. inspectBody stops at nested FuncLit boundaries (return false), so inner scopes do not bleed into the outer check. Each scope is visited exactly once. The design is correct.

`NewAnalyzer` gap (P2 — discussed in review body due to few changed lines in that file)

NewAnalyzer in analyzer.go returns a standalone go/analysis.Analyzer for use with go vet/gopls. Its run closure calls only checkFileImports and checkFileSelectors — it does not call checkFileScannerBuffer or checkFileOpenFileClose. Structural rules are therefore absent when NewAnalyzer is used as a standalone analyzer. The ScannerBufferAnalyzer and OpenFileCloseAnalyzer vars are exported but have no registration point in this repo (no cmd/, no CI step invoking them). Currently the test-harness path in checkAllowedSymbols is the only place structural rules run in CI, which is sufficient for now — but this should be documented in the NewAnalyzer godoc so future integrators know they must register both structural analyzers separately.

…ctural-rules

P2: make isClosedTransitive truly transitive via BFS walk so that multi-hop hand-off chains (f → a → b → Close()) are correctly detected instead of only checking one level deep. P1/P2: document path-insensitivity limitations with comments in structural.go and dedicated Known-Limitation tests in structural_test.go: - returned-map false negative: a variable returned on any branch exempts it globally (path-sensitive CFG analysis would be needed to fix) - variable name reuse false negative: re-using the same name for two successive OpenFile results can cause one leak to go undetected P2: update NewAnalyzer godoc to note that ScannerBufferAnalyzer and OpenFileCloseAnalyzer must also be registered for full coverage. P3: remove dead collectViolations helper (silently dropped format args); add TestOpenFileCloseChainedHandOff to exercise the now-fixed BFS walk. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

astuyve

I think just remove the dead method and buffers and we're good

astuyve · 2026-03-30T14:09:18Z

 		return "?"
 	}
 	scanner := bufio.NewScanner(bytes.NewReader(data))
+	scanner.Buffer(make([]byte, 4096), 1<<20)


I think you're already reading the whole file into memory with os.ReadFile so this is just an extra allocation, no?

[Claude Sonnet 4.6] Good catch on the readUID case — you're right that os.ReadFile already loads everything into memory, so the buffer cap is technically redundant there. However, the checkFileScannerBuffer structural rule added by this PR enforces that all bufio.NewScanner results have .Buffer() called, regardless of the reader type. Removing it would cause our own checker to fail CI. I've left both calls in place for consistency with the rule, and the rule itself doesn't distinguish between file-backed and in-memory readers.

astuyve · 2026-03-30T14:09:24Z

 	}
 	defer f.Close()
 	scanner := bufio.NewScanner(f)
+	scanner.Buffer(make([]byte, 4096), 1<<20)


[Claude Sonnet 4.6] Same reasoning applies — the checkFileScannerBuffer rule requires the call. For procBootTime specifically the buffer cap is also useful since it reads from a real file handle, not an in-memory reader.

- Remove unused `parseSnippet` function from structural_test.go - Rename workflow job `check-allowed-symbols` → `check-analysis` and step "Require verified/allowed_symbols label" → "Require verified/analysis label" to match the allowedsymbols/ → analysis/ package rename Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

thieman and others added 3 commits March 27, 2026 13:26

thieman added verified/analysis Human-reviewed static analysis changes verified/allowed_symbols labels Mar 27, 2026

thieman commented Mar 27, 2026

View reviewed changes

Comment thread analysis/structural.go

Comment thread analysis/structural.go

Comment thread analysis/structural.go

Comment thread analysis/structural_test.go Outdated

Comment thread analysis/structural_test.go

thieman and others added 2 commits March 27, 2026 15:14

Merge remote-tracking branch 'origin/main' into thieman/analyzer-stru…

1f42c25

…ctural-rules

thieman marked this pull request as ready for review March 27, 2026 20:16

thieman requested review from AlexandreYang, astuyve, julesmcrt, matt-dz and val06 as code owners March 27, 2026 20:16

astuyve approved these changes Mar 30, 2026

View reviewed changes

thieman added this pull request to the merge queue Mar 30, 2026

Merged via the queue into main with commit d3e7ec1 Mar 30, 2026
34 checks passed

thieman deleted the thieman/analyzer-structural-rules branch March 30, 2026 14:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(allowedsymbols): add structural AST rules for Scanner.Buffer and OpenFile.Close#162

feat(allowedsymbols): add structural AST rules for Scanner.Buffer and OpenFile.Close#162
thieman merged 6 commits intomainfrom
thieman/analyzer-structural-rules

thieman commented Mar 27, 2026

Uh oh!

thieman left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

astuyve left a comment

Uh oh!

Uh oh!

astuyve Mar 30, 2026

Uh oh!

thieman Mar 30, 2026

Uh oh!

astuyve Mar 30, 2026

Uh oh!

thieman Mar 30, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

thieman commented Mar 27, 2026

Summary

Test plan

Uh oh!

thieman left a comment

Choose a reason for hiding this comment

Security-Focused Code Review — PR #162

Summary of Findings

The procinfo_linux.go fixes

forEachFuncBody + inspectBody correctness

NewAnalyzer gap (P2 — discussed in review body due to few changed lines in that file)

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

astuyve left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

astuyve Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

thieman Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

astuyve Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

thieman Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

The `procinfo_linux.go` fixes

`forEachFuncBody` + `inspectBody` correctness

`NewAnalyzer` gap (P2 — discussed in review body due to few changed lines in that file)