fix: address memory edge cases in grep context, sort input, and script parsing #145
Merged
…t parsing

- grep: add 512 KiB per-match-group aggregate byte cap on before-context sliding window (-B/-C) and after-context output stream (-A/-C)
- sort: lower MaxTotalBytes cap from 256 MiB to 5 MiB to bound memory in sort chains where N concurrent instances hold their full input simultaneously; improve error message with limit value and guidance
- interp: add ParseScript() helper and MaxScriptBytes (5 MiB) constant so library callers have a size-checked parse entry point; update the CLI execute() to use it

Gap 4 (pipeline goroutine depth) is documented but not fixed: pipelines are parsed left-recursively so N stages produce N-1 simultaneous goroutines. A semaphore deadlocks due to pipe backpressure holding goroutine slots. The principled fix (flatten pipeline + sliding-window goroutine pool) requires significantly reworking the execution model and is not worth the complexity.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
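The before-context cap described in this commit can be pictured with a minimal sketch. The type `beforeWindow`, its fields, and the constant name `maxContextBytes` are hypothetical illustrations, not the code in this PR; the sketch only assumes that the window is bounded by both a line count (`-B N`) and an aggregate byte ceiling, and that the oldest lines are evicted first.

```go
package main

import "fmt"

// maxContextBytes mirrors the 512 KiB cap from the commit message.
// The name and the type below are assumptions made for illustration.
const maxContextBytes = 512 * 1024

// beforeWindow holds the most recent lines seen before a match,
// bounded by a line count and by a total byte budget.
type beforeWindow struct {
	lines    [][]byte
	bytes    int
	maxLines int
	maxBytes int
}

// push appends a line, then evicts the oldest lines until both limits hold.
func (w *beforeWindow) push(line []byte) {
	w.lines = append(w.lines, line)
	w.bytes += len(line)
	for len(w.lines) > 0 && (len(w.lines) > w.maxLines || w.bytes > w.maxBytes) {
		w.bytes -= len(w.lines[0])
		w.lines[0] = nil // drop the reference so the evicted line can be collected
		w.lines = w.lines[1:]
	}
}

// reset clears the window, as would happen at the start of a new match group.
func (w *beforeWindow) reset() {
	w.lines = nil
	w.bytes = 0
}

func main() {
	w := &beforeWindow{maxLines: 1000, maxBytes: maxContextBytes}
	line := make([]byte, 8*1024) // 8 KiB lines, as in a -B 1000 scenario
	for i := 0; i < 1000; i++ {
		w.push(line)
	}
	fmt.Println(len(w.lines), w.bytes) // 64 524288
}
```

With 8 KiB lines the byte ceiling, not the 1000-line count, is the binding limit: only 64 lines fit in 512 KiB. A production version would likely use a ring buffer rather than re-slicing; the slice form is used here only to keep the eviction rule visible.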
…Script

- grep: TestGrepBeforeContextByteCapEvictsOldLines and TestGrepAfterContextByteCapTruncatesOutput verify that output stays within MaxContextBytes (512 KiB) per match group when large lines are used with -B/-A; TestGrepBeforeContextMemoryBounded (bench-as-test) checks per-operation allocation stays within budget for -B 1000 with 8 KiB lines
- sort: TestSortInputExceedsMaxTotalBytes verifies exit 1 with descriptive error when input content exceeds MaxTotalBytes (5 MiB); TestSortInputBelowMaxTotalBytes confirms the boundary is accepted
- ParseScript: TestParseScriptRejectsOversizedInput / AcceptsValidInput / RejectsInvalidSyntax cover the interp.ParseScript API directly; TestScriptExceedsMaxScriptBytes and TestScriptAtMaxScriptBytes cover the CLI path through execute()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Change context.Background() to t.Context() as the parent for per-invocation timeouts in FuzzTailLines, FuzzTailBytes, FuzzTailStdin, and FuzzTailBytesOffset. Add early-exit checks (t.Context().Err() != nil) at the top of each fuzz function and after each cmdRunCtx call. When the 30-second fuzz budget expires the framework cancels t.Context(), which now propagates through the WithTimeout parent chain and unblocks stuck workers instead of leaving them to drain the budget.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AlexandreYang
approved these changes
Mar 25, 2026
Summary
Addresses memory resource gaps identified in `valeri.pliskin/memory-edge-cases:MEMORY_GAPS.md` (gaps 1, 3, and 5). See main...valeri.pliskin/memory-edge-cases

Gap 1 (`grep` context buffer): Add a `MaxContextBytes = 512 KiB` aggregate byte cap applied per match group to both the before-context sliding window (`-B`/`-C`) and the after-context output stream (`-A`/`-C`). Before-context evicts the oldest lines when the byte ceiling is hit; after-context stops printing context lines. Both counters reset at the start of each new match group. The global executor output limit bounds total output across all groups.

Gap 3 (`sort` input size): Lower `MaxTotalBytes` from 256 MiB to 5 MiB. In a pipeline, N concurrent `sort` instances each buffer their full input simultaneously with no cross-stage cap. At 256 MiB, a 3-sort chain could consume ~768 MiB; at 5 MiB the same chain buffers at most ~15 MiB. The error message now includes the limit value and actionable guidance for callers.

Gap 5 (script size limit): Add `interp.ParseScript(script, name string)` and `interp.MaxScriptBytes = 5 MiB` to the `interp` package. Unlike all other inputs (variables, command substitution, per-line builtins), the script itself previously had no cap inside rshell; it relied entirely on callers. `ParseScript` enforces the limit before the syntax parser allocates any memory. The CLI `execute()` now uses this helper. Library callers should use `ParseScript` rather than calling the underlying syntax parser directly.

Gap 4: pipeline goroutine depth (documented, not fixed)

Gap 4 (unbounded goroutine count from deep pipelines) is intentionally not addressed here. The constraints:

- Pipelines are parsed left-recursively: `a | b | c | d` becomes `((a|b)|c)|d`. Each `|` node spawns a goroutine for its left side and runs the right side synchronously, so N pipeline stages produce N−1 simultaneously live goroutines with no enforced limit.
- A semaphore deadlocks due to pipe backpressure holding goroutine slots.
- The principled fix (flatten pipeline + sliding-window goroutine pool) requires significantly reworking the execution model … `runner_exec.go`. The complexity and risk are not proportionate to the benefit at this time.

Test plan

- `go test ./...` passes (all existing tests green)
- `grep -B 1000` with 1 MiB lines stays within the 512 KiB window cap
- `sort` rejects input > 5 MiB with a descriptive error
- `interp.ParseScript` rejects scripts > 5 MiB before any parsing occurs
- `execute()` in the CLI correctly surfaces the size error with exit code 2

🤖 Generated with Claude Code