diff --git a/.claude/skills/code-review/SKILL.md b/.claude/skills/code-review/SKILL.md index c9553c49..84d52ae5 100644 --- a/.claude/skills/code-review/SKILL.md +++ b/.claude/skills/code-review/SKILL.md @@ -114,10 +114,56 @@ For every behavioral change: ### D. Test Coverage -- **Are new behaviors tested?** Every new code path should have a corresponding test -- **Are edge cases tested?** Empty input, boundary values, error conditions -- **YAML scenario conventions**: prefer `expect.stderr` over `stderr_contains`; tests are asserted against bash by default; use `stdout_windows`/`stderr_windows` for platform-specific output -- **Bash comparison**: if YAML scenarios are added or modified, verify they pass against bash +Analyze coverage of changed code from two angles: **scenario tests** (YAML) and **Go tests**. Scenario tests are preferred because they also verify bash compatibility. + +#### Step 1: Inventory changed code paths + +For each changed or added function/branch/error-path, list the code path (e.g. "cut: `-f` with `--complement` and `--output-delimiter`", "error when delimiter is multi-byte"). + +#### Step 2: Check scenario test coverage (priority) + +Search `tests/scenarios/cmd//` for YAML scenarios that exercise each code path identified in Step 1. + +- **Covered** — a scenario exists whose `input.script` triggers the code path and `expect` asserts the output. +- **Partially covered** — a scenario triggers the code path but doesn't assert stderr, exit code, or an important edge case. +- **Not covered** — no scenario exercises the code path. + +Flag **not covered** and **partially covered** paths as findings. Suggest concrete YAML scenario(s) to add (including `description`, `input.script`, and expected `stdout`/`stderr`/`exit_code`). 
+ +Scenario test conventions: +- Prefer `expect.stderr` (exact match) over `stderr_contains` +- Tests are asserted against bash by default — only use `skip_assert_against_bash: true` for intentional divergence +- Use `stdout_windows`/`stderr_windows` for platform-specific output +- If YAML scenarios are added or modified, verify they pass against bash + +#### Step 3: Check Go test coverage + +Search `interp/builtins//*_test.go` for Go tests that exercise any code paths **not already covered by scenario tests**. Go test types to check: + +| Test type | File pattern | What it covers | +|-----------|-------------|----------------| +| Functional | `_test.go` | Core logic, argument parsing, edge cases | +| GNU compat | `_gnu_compat_test.go` | Byte-for-byte output equivalence with GNU coreutils | +| Pentest | `_pentest_test.go` | Security vectors (overflow, special files, resource exhaustion) | +| Platform | `_{unix,windows}_test.go` | OS-specific behavior | + +Only flag missing Go tests for paths that **cannot be adequately covered by scenario tests** (e.g. internal error handling, concurrency, memory limits, platform-specific behavior, performance-sensitive paths). + +#### Step 4: Produce coverage summary + +Include a coverage table in the review output: + +```markdown +| Code path | Scenario test | Go test | Status | +|-----------|:---:|:---:|--------| +| `-f` with `--complement` | tests/scenarios/cmd/cut/complement/fields.yaml | — | Covered | +| multi-byte delimiter error | — | — | **Missing** | +| `/dev/zero` hang protection | skip (intentional divergence) | cut_pentest_test.go:45 | Covered | +``` + +Mark the overall coverage status: +- **Adequate** — all new/changed code paths are covered (scenario or Go tests) +- **Gaps found** — list missing coverage as P2 or P3 findings ### E. 
Code Quality diff --git a/.claude/skills/review-fix-loop/SKILL.md b/.claude/skills/review-fix-loop/SKILL.md new file mode 100644 index 00000000..418db23f --- /dev/null +++ b/.claude/skills/review-fix-loop/SKILL.md @@ -0,0 +1,341 @@ +--- +name: review-fix-loop +description: "Self-review a PR, fix all issues, and re-review in a loop until clean. Coordinates code-review, address-pr-comments, and fix-ci-tests skills." +argument-hint: "[pr-number|pr-url]" +--- + +Self-review and iteratively fix **$ARGUMENTS** (or the current branch's PR if no argument is given) until the review is clean. + +--- + +## ⛔ STOP — READ THIS BEFORE DOING ANYTHING ELSE ⛔ + +You MUST follow this execution protocol. Skipping steps or running them out of order has caused regressions and wasted iterations in every prior run of this skill. + +### 1. Create the full task list FIRST + +Your very first action — before reading ANY files, before running ANY commands — is to call TaskCreate exactly 11 times, once for each step/sub-step below. Use these exact subjects: + +1. "Step 1: Identify the PR" +2. "Step 2: Run the review-fix loop" +3. "Step 2A1: Self-review (code-review)" ← **parallel with 2A2** +4. "Step 2A2: Request external reviews (@datadog @codex)" ← **parallel with 2A1** +5. "Step 2B: Address PR comments (address-pr-comments)" +6. "Step 2C: Fix CI failures (fix-ci-tests)" +7. "Step 2D: Verify push and resolve conflicts" +8. "Step 2E: Check CI status" +9. "Step 2F: Decide whether to continue" +10. "Step 3: Verify clean state" +11. "Step 4: Final summary" + +**Note on sub-steps 2A–2F:** These are created once and reused across loop iterations. At the start of each iteration, reset all sub-steps to `pending`, then execute them in order. Sub-steps marked **parallel** are launched concurrently and must both complete before proceeding to the next group. + +### 2. 
Execution order and gating + +Steps run strictly in this order: + +``` +Step 1 → Step 2 (loop: [2A1 ∥ 2A2] → 2B → 2C → 2D → 2E → 2F) → Step 3 → Step 4 + ↑ ↓ + └──────────────── repeat ───────────────────┘ +``` + +**Top-level steps** are sequential: before starting step N, call TaskList and verify step N-1 is `completed`. Set step N to `in_progress`. + +**Sub-steps within Step 2** follow this execution order: + +| Phase | Sub-steps | Execution | +|-------|-----------|-----------| +| Review | **2A1** ∥ **2A2** | **Parallel** — launch both, wait for both | +| Fix comments | **2B** | Sequential | +| Fix CI | **2C** | Sequential — run after 2B completes | +| Verify | **2D** | Sequential | +| CI check | **2E** | Sequential | +| Decide | **2F** | Sequential | + +### 3. Never skip steps + +- Do NOT skip the review (Step 2A1) because you think the code is fine +- Do NOT skip verification (Step 3) because tests passed during fixes +- Do NOT skip the external review trigger — @datadog and @codex reviews catch issues the self-review misses +- Do NOT mark a step completed until every sub-bullet in that step is satisfied + +If you catch yourself wanting to skip a step, STOP and do the step anyway. + +--- + +## Step 1: Identify the PR + +**Set this step to `in_progress` immediately after creating all tasks.** + +```bash +# If argument provided, use it; otherwise detect from current branch +gh pr view $ARGUMENTS --json number,url,headRefName,baseRefName +``` + +If `$ARGUMENTS` is empty, this automatically falls back to the PR associated with the current branch. If no PR is found, stop and inform the user. + +Store the PR number, head branch, and base branch for all subsequent steps. + +```bash +gh repo view --json owner,name --jq '"\(.owner.login)/\(.name)"' +``` + +Store the owner and repo name. + +**Completion check:** You have the PR number, URL, owner, repo, head branch, and base branch. Mark Step 1 as `completed`. 
+ +--- + +## Step 2: Run the review-fix loop + +**GATE CHECK**: Call TaskList. Step 1 must be `completed`. Set Step 2 to `in_progress`. + +Set `iteration = 1`. Maximum iterations: **10**. Repeat sub-steps 2A1 through 2F while `iteration <= 10`: + +--- + +### Sub-step 2A1 — Self-review ← **parallel with 2A2** + +Run the **code-review** skill on the PR: +``` +/code-review +``` +This analyzes the full diff against main, posts findings as a GitHub PR review with inline comments, and classifies findings by severity (P0–P3). + +### Sub-step 2A2 — Request external reviews ← **parallel with 2A1** + +Post a comment to trigger @datadog and @codex reviews: +```bash +gh pr comment --body "@datadog @codex make a comprehensive code and security review" +``` +The external reviews arrive asynchronously — their comments will be picked up by **address-pr-comments** in Sub-step 2B. + +### After 2A1 ∥ 2A2 complete + +Wait for **both** to complete before proceeding. + +**Record the self-review outcome (from 2A1):** +- If the review result is **APPROVE** (no findings) → skip to **Sub-step 2E (CI check)** +- If there are findings → continue to **Sub-step 2B** + +--- + +### Pre-check before 2B + +Before launching fixes, ensure the working tree is clean and up to date: + +```bash +git status +git pull --rebase origin +``` + +### Sub-step 2B — Address PR comments + +Run the **address-pr-comments** skill: +``` +/address-pr-comments +``` +This reads all unresolved review comments, evaluates validity, implements fixes, commits, pushes, and replies/resolves threads. + +Wait for completion before proceeding to 2C. + +### Sub-step 2C — Fix CI failures + +Run the **fix-ci-tests** skill: +``` +/fix-ci-tests +``` +This checks for failing CI jobs, downloads logs, reproduces failures locally, fixes them, and pushes. + +Wait for completion before proceeding to 2D. 
+ +--- + +### Sub-step 2D — Verify push and sync + +After 2B and 2C complete, verify the branch state: + +```bash +git fetch origin +git status +git log --oneline -5 +``` + +1. If there are unpushed commits, push them. +2. Pull the latest remote state to stay in sync: + ```bash + git pull --rebase origin + ``` +3. Confirm the branch is up to date with the remote. + +**Completion check:** `git status` shows a clean working tree and the branch is pushed. Only then proceed. + +--- + +### Sub-step 2E — Check CI status + +```bash +gh pr checks --json name,state +``` + +- If any checks are **failing** → run the **fix-ci-tests** skill one more time: + ``` + /fix-ci-tests + ``` + Wait for it to complete, then re-check CI status. If still failing after this second attempt, log the failure and continue to Sub-step 2F. + +- If all checks are **passing** or **pending** → continue to Sub-step 2F. + +--- + +### Sub-step 2F — Decide whether to continue + +Increment `iteration`. + +Check **all three** review sources for remaining issues: + +1. **Self-review** — Was the latest `/code-review` result **APPROVE** (no findings)? + +2. **External reviews** — Are there unresolved PR comment threads from @datadog or @codex? + ```bash + gh api graphql -f query=' + query($owner: String!, $repo: String!, $pr: Int!) { + repository(owner: $owner, name: $repo) { + pullRequest(number: $pr) { + reviewThreads(first: 100) { + nodes { + isResolved + comments(first: 1) { + nodes { author { login } body } + } + } + } + } + } + } + ' -f owner="{owner}" -f repo="{repo}" -F pr={pr-number} \ + --jq '.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved == false)' + ``` + +3. **CI** — Are all checks passing? 
+ ```bash + gh pr checks --json name,state + ``` + +**Decision matrix:** + +| Self-review | External comments | CI | Action | +|------------|-------------------|-----|--------| +| APPROVE | None unresolved | Passing | **STOP — PR is clean** | +| Any findings | Any | Any | **Continue** → go back to Sub-step 2A1 ∥ 2A2 | +| APPROVE | Unresolved threads | Any | **Continue** → go back to Sub-step 2A1 ∥ 2A2 (address-pr-comments will handle them) | +| APPROVE | None unresolved | Failing | **Continue** → go back to Sub-step 2A1 ∥ 2A2 (fix-ci-tests will handle it) | +| — | — | — | If `iteration > 10` → **STOP — iteration limit reached** | + +Log the iteration result before continuing or stopping: +- Iteration number +- Self-review result (APPROVE / COMMENT / REQUEST_CHANGES) +- Number of findings by severity +- Number of fixes applied +- CI status + +--- + +**Step 2 completion check:** The loop exited because either (a) all three conditions are met (clean), or (b) the iteration limit was reached. Mark Step 2 as `completed`. + +--- + +## Step 3: Verify clean state + +**GATE CHECK**: Call TaskList. Step 2 must be `completed`. Set Step 3 to `in_progress`. + +Run a final verification regardless of how the loop exited: + +1. **Confirm branch is pushed:** + ```bash + git status + git log --oneline origin/..HEAD + ``` + If there are unpushed commits, push them. + +2. **Confirm CI status:** + ```bash + gh pr checks --json name,state + ``` + +3. **Confirm no unresolved threads:** + ```bash + gh api graphql -f query=' + query($owner: String!, $repo: String!, $pr: Int!) 
{ + repository(owner: $owner, name: $repo) { + pullRequest(number: $pr) { + reviewThreads(first: 100) { + nodes { + isResolved + comments(first: 1) { + nodes { author { login } body } + } + } + } + } + } + } + ' -f owner="{owner}" -f repo="{repo}" -F pr={pr-number} \ + --jq '.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved == false) | .comments.nodes[0].body' \ + 2>&1 | head -50 + ``` + +Record the final state of each dimension (self-review, external reviews, CI). + +**If any verification fails** (CI failing, unresolved threads remain, or unpushed commits that can't be pushed), reset Step 2 and all its sub-steps to `pending`, and go back to **Step 2: Run the review-fix loop** for another iteration. Only proceed to Step 4 when all three verifications pass. + +**Completion check:** All three verifications passed. Mark Step 3 as `completed`. + +--- + +## Step 4: Final summary + +**GATE CHECK**: Call TaskList. Step 3 must be `completed`. Set Step 4 to `in_progress`. + +Provide a summary in this exact format: + +```markdown +## Review-Fix Loop Summary + +- **PR**: # () +- **Iterations completed**: +- **Final status**: + +### Iteration log + +| # | Review result | Findings | Fixes applied | CI status | +|---|--------------|----------|---------------|-----------| +| 1 | REQUEST_CHANGES | 3 (1×P1, 2×P2) | 3 fixed | Passing | +| 2 | COMMENT | 1 (1×P3) | 1 fixed | Passing | +| 3 | APPROVE | 0 | — | Passing | + +### Final state + +- **Self-review**: APPROVE / REQUEST_CHANGES / COMMENT +- **Unresolved external comments**: (list authors) +- **CI**: Passing / Failing (list failing checks) + +### Remaining issues (if any) + +- +``` + +**Completion check:** Summary is output. Mark Step 4 as `completed`. + +--- + +## Important rules + +- **Never skip the review step** — always re-review after fixes to catch regressions or new issues introduced by the fixes themselves. 
+- **Always submit reviews to GitHub** — each iteration's review must be posted as PR comments so there's a visible trail. +- **Run address-pr-comments before fix-ci-tests** — 2B then 2C, sequentially, so CI fixes run on code that already incorporates review feedback. +- **Pull before fixing** — always `git pull --rebase` before launching fix agents to avoid working on stale code. +- **Stop early on APPROVE + CI green + no unresolved threads** — don't waste iterations if the PR is already clean. +- **Respect the iteration limit** — hard stop at 10 to prevent infinite loops. If issues persist after 10 iterations, report what's left for the user to handle. +- **Use gate checks** — always call TaskList and verify prerequisites before starting a step. This prevents out-of-order execution. diff --git a/SHELL_FEATURES.md b/SHELL_FEATURES.md index 7b26786a..6dfa9ca9 100644 --- a/SHELL_FEATURES.md +++ b/SHELL_FEATURES.md @@ -8,6 +8,7 @@ Blocked features are rejected before execution with exit code 2. - ✅ `break` — exit the innermost `for` loop - ✅ `cat [-n] [FILE]...` — concatenate files to stdout; `-n` numbers output lines - ✅ `continue` — skip to the next iteration of the innermost `for` loop +- ✅ `cut [-b LIST|-c LIST|-f LIST] [-d DELIM] [-s] [-n] [--complement] [--output-delimiter=STRING] [FILE]...` — remove sections from each line of files - ✅ `echo [-n] [-e] [ARG]...` — write arguments to stdout - ✅ `exit [N]` — exit the shell with status N (default 0) - ✅ `false` — return exit code 1 diff --git a/interp/builtin_cut_gnu_compat_test.go b/interp/builtin_cut_gnu_compat_test.go new file mode 100644 index 00000000..eb78d528 --- /dev/null +++ b/interp/builtin_cut_gnu_compat_test.go @@ -0,0 +1,169 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. 
+ +package interp_test + +import ( + "bytes" + "context" + "errors" + "os" + "path/filepath" + "strings" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + "mvdan.cc/sh/v3/syntax" + + "github.com/DataDog/rshell/interp" +) + +func cutRun(t *testing.T, script, dir string) (string, string, int) { + t.Helper() + parser := syntax.NewParser() + prog, err := parser.Parse(strings.NewReader(script), "") + require.NoError(t, err) + + var outBuf, errBuf bytes.Buffer + opts := []interp.RunnerOption{ + interp.StdIO(nil, &outBuf, &errBuf), + interp.AllowedPaths([]string{dir}), + } + + runner, err := interp.New(opts...) + require.NoError(t, err) + defer runner.Close() + + if dir != "" { + runner.Dir = dir + } + + err = runner.Run(context.Background(), prog) + exitCode := 0 + if err != nil { + var es interp.ExitStatus + if errors.As(err, &es) { + exitCode = int(es) + } else { + t.Fatalf("unexpected error: %v", err) + } + } + return outBuf.String(), errBuf.String(), exitCode +} + +func cutWriteFile(t *testing.T, dir, name, content string) { + t.Helper() + require.NoError(t, os.WriteFile(filepath.Join(dir, name), []byte(content), 0644)) +} + +func setupCutDir(t *testing.T, files map[string]string) string { + t.Helper() + dir := t.TempDir() + for name, content := range files { + cutWriteFile(t, dir, name, content) + } + return dir +} + +// GNU: printf 'a:b:c\n' | cut -d: -f1,3- +// Output: a:c +func TestGNUCompatCutFieldBasic(t *testing.T) { + dir := setupCutDir(t, map[string]string{"input": "a:b:c\n"}) + stdout, _, code := cutRun(t, "cut -d: -f1,3- input", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "a:c\n", stdout) +} + +// GNU: printf '123\n' | cut -c4 +// Output: (empty line) +func TestGNUCompatCutByteSelect(t *testing.T) { + dir := setupCutDir(t, map[string]string{"input": "123\n"}) + stdout, _, code := cutRun(t, "cut -c4 input", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "\n", stdout) +} + +// GNU: printf 'a:b:c\n' | cut 
-d: --output-delimiter=_ -f2,3 +// Output: b_c +func TestGNUCompatCutOutputDelimiter(t *testing.T) { + dir := setupCutDir(t, map[string]string{"input": "a:b:c\n"}) + stdout, _, code := cutRun(t, "cut -d: --output-delimiter=_ -f2,3 input", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "b_c\n", stdout) +} + +// GNU: printf 'abc\n' | cut -s -d: -f2,3 +// Output: (nothing) +func TestGNUCompatCutSuppressNoDelim(t *testing.T) { + dir := setupCutDir(t, map[string]string{"input": "abc\n"}) + stdout, _, code := cutRun(t, "cut -s -d: -f2,3 input", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "", stdout) +} + +// GNU: printf ':::\n' | cut -d: -f1-3 +// Output: :: +func TestGNUCompatCutEmptyFields(t *testing.T) { + dir := setupCutDir(t, map[string]string{"input": ":::\n"}) + stdout, _, code := cutRun(t, "cut -d: -f1-3 input", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "::\n", stdout) +} + +// GNU: printf 'a\nb' | cut -f1- +// Output: a\nb\n (trailing newline added to last line) +func TestGNUCompatCutNewlineHandling(t *testing.T) { + dir := setupCutDir(t, map[string]string{"input": "a\nb"}) + stdout, _, code := cutRun(t, "cut -f1- input", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "a\nb\n", stdout) +} + +// GNU: printf 'a:1\nb:2' | cut -d: -f2 +// Output: 1\n2\n +func TestGNUCompatCutFieldNoTrailing(t *testing.T) { + dir := setupCutDir(t, map[string]string{"input": "a:1\nb:2"}) + stdout, _, code := cutRun(t, "cut -d: -f2 input", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "1\n2\n", stdout) +} + +// GNU: printf '123456\n' | cut --complement -b3,4-4,5,2- +// Output: 1 +func TestGNUCompatCutComplement(t *testing.T) { + dir := setupCutDir(t, map[string]string{"input": "123456\n"}) + stdout, _, code := cutRun(t, "cut --complement -b3,4-4,5,2- input", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "1\n", stdout) +} + +// GNU: printf 'abcd\n' | cut -b1-2,3-4 --output-delimiter=: +// Output: ab:cd +func TestGNUCompatCutOutputDelimBytesAdjacent(t 
*testing.T) { + dir := setupCutDir(t, map[string]string{"input": "abcd\n"}) + stdout, _, code := cutRun(t, "cut -b1-2,3-4 --output-delimiter=: input", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "ab:cd\n", stdout) +} + +// GNU: printf 'abc\n' | cut -b1-2,2 --output-delimiter=: +// Output: ab (overlapping ranges merged, no extra delimiter) +func TestGNUCompatCutOutputDelimOverlap(t *testing.T) { + dir := setupCutDir(t, map[string]string{"input": "abc\n"}) + stdout, _, code := cutRun(t, "cut -b1-2,2 --output-delimiter=: input", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "ab\n", stdout) +} + +// Unknown flag should produce exit 1 and an error message. +func TestGNUCompatCutRejectedFlags(t *testing.T) { + dir := setupCutDir(t, map[string]string{"input": "a\n"}) + for _, flag := range []string{"--no-such-flag", "-Z"} { + _, stderr, code := cutRun(t, "cut "+flag+" input", dir) + assert.Equal(t, 1, code, "flag: %s", flag) + assert.Contains(t, stderr, "cut:", "flag: %s", flag) + } +} diff --git a/interp/builtin_cut_pentest_test.go b/interp/builtin_cut_pentest_test.go new file mode 100644 index 00000000..473a2953 --- /dev/null +++ b/interp/builtin_cut_pentest_test.go @@ -0,0 +1,313 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. 
+ +package interp_test + +import ( + "bytes" + "context" + "errors" + "os" + "path/filepath" + "strings" + "testing" + "time" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + "mvdan.cc/sh/v3/syntax" + + "github.com/DataDog/rshell/interp" +) + +func cutPentestRun(t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return cutPentestRunCtx(context.Background(), t, script, dir) +} + +func cutPentestRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + parser := syntax.NewParser() + prog, err := parser.Parse(strings.NewReader(script), "") + require.NoError(t, err) + + var outBuf, errBuf bytes.Buffer + opts := []interp.RunnerOption{ + interp.StdIO(nil, &outBuf, &errBuf), + interp.AllowedPaths([]string{dir}), + } + + runner, err := interp.New(opts...) + require.NoError(t, err) + defer runner.Close() + + if dir != "" { + runner.Dir = dir + } + + err = runner.Run(ctx, prog) + exitCode := 0 + if err != nil { + var es interp.ExitStatus + if errors.As(err, &es) { + exitCode = int(es) + } else if ctx.Err() == nil { + t.Fatalf("unexpected error: %v", err) + } + } + return outBuf.String(), errBuf.String(), exitCode +} + +func cutPentestWriteFile(t *testing.T, dir, name, content string) { + t.Helper() + require.NoError(t, os.WriteFile(filepath.Join(dir, name), []byte(content), 0644)) +} + +// --- Integer edge cases --- + +func TestCutPentestZeroByte(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cutPentestRun(t, "cut -b0 file", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +func TestCutPentestZeroField(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cutPentestRun(t, "cut -f0 file", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +func TestCutPentestNegativeField(t *testing.T) { + dir := t.TempDir() + // -f-1 is parsed as -f with value "-1"; the leading dash in the value is + // tricky because pflag may 
interpret "-1" as a flag. Let's try with space. + _, stderr, code := cutPentestRun(t, "cut -f -1 file", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +func TestCutPentestHugeNumber(t *testing.T) { + dir := t.TempDir() + cutPentestWriteFile(t, dir, "file.txt", "abc\n") + // Very large number should not cause OOM — strconv.Atoi handles overflow. + _, stderr, code := cutPentestRun(t, "cut -b99999999999999999999 file.txt", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +func TestCutPentestMaxInt32Range(t *testing.T) { + dir := t.TempDir() + cutPentestWriteFile(t, dir, "file.txt", "abc\n") + // Large bounded range should work (clamped to line length). + stdout, _, code := cutPentestRun(t, "cut -b1-2147483647 file.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "abc\n", stdout) +} + +func TestCutPentestLargeUnboundedRange(t *testing.T) { + dir := t.TempDir() + cutPentestWriteFile(t, dir, "file.txt", "") + stdout, _, code := cutPentestRun(t, "cut -b1234567890- file.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "", stdout) +} + +func TestCutPentestEmptyList(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cutPentestRun(t, "cut -b '' file", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +// --- Flag and argument injection --- + +func TestCutPentestUnknownFlags(t *testing.T) { + dir := t.TempDir() + for _, flag := range []string{"-z", "--follow", "--no-such-flag"} { + _, stderr, code := cutPentestRun(t, "cut "+flag+" file", dir) + assert.Equal(t, 1, code, "flag: %s", flag) + assert.Contains(t, stderr, "cut:", "flag: %s", flag) + } +} + +func TestCutPentestDoubleDashFlagLikeFile(t *testing.T) { + dir := t.TempDir() + cutPentestWriteFile(t, dir, "-v", "hello\n") + stdout, _, code := cutPentestRun(t, "cut -f1 -- -v", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "hello\n", stdout) +} + +func TestCutPentestMultipleStdin(t *testing.T) { + dir := t.TempDir() + 
cutPentestWriteFile(t, dir, "file.txt", "a:b\n") + stdout, _, code := cutPentestRun(t, "cat file.txt | cut -d: -f1 - -", dir) + assert.Equal(t, 0, code) + // First stdin reads data, second gets EOF. + assert.Equal(t, "a\n", stdout) +} + +func TestCutPentestFlagExpansionInLoop(t *testing.T) { + dir := t.TempDir() + cutPentestWriteFile(t, dir, "file.txt", "hello\n") + _, stderr, code := cutPentestRun(t, "for flag in --follow; do cut $flag file.txt; done", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +// --- Path edge cases --- + +func TestCutPentestNonexistentFile(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cutPentestRun(t, "cut -b1 nonexistent.txt", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +func TestCutPentestEmptyFilename(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cutPentestRun(t, "cut -b1 ''", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +// --- Context cancellation --- + +func TestCutPentestContextCancelled(t *testing.T) { + dir := t.TempDir() + ctx, cancel := context.WithCancel(context.Background()) + cancel() + _, _, _ = cutPentestRunCtx(ctx, t, "cut -b1 file", dir) +} + +func TestCutPentestContextTimeout(t *testing.T) { + dir := t.TempDir() + cutPentestWriteFile(t, dir, "file.txt", strings.Repeat("abcdef\n", 10000)) + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + stdout, _, code := cutPentestRunCtx(ctx, t, "cut -b1-3 file.txt", dir) + assert.Equal(t, 0, code) + assert.Contains(t, stdout, "abc") +} + +// --- Large input --- + +func TestCutPentestLargeFile(t *testing.T) { + dir := t.TempDir() + content := strings.Repeat("field1:field2:field3\n", 40000) + cutPentestWriteFile(t, dir, "large.txt", content) + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + defer cancel() + stdout, _, code := cutPentestRunCtx(ctx, t, "cut -d: -f2 large.txt", dir) + assert.Equal(t, 0, code) + 
assert.Contains(t, stdout, "field2") +} + +// --- Many files (FD leak check) --- + +func TestCutPentestManyFiles(t *testing.T) { + dir := t.TempDir() + var args []string + for i := range 50 { + name := strings.ReplaceAll(filepath.Base(t.Name()), "/", "_") + "_" + string(rune('a'+i%26)) + string(rune('0'+i/26)) + ".txt" + require.NoError(t, os.WriteFile(filepath.Join(dir, name), []byte("x:y\n"), 0644)) + args = append(args, name) + } + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + defer cancel() + stdout, _, code := cutPentestRunCtx(ctx, t, "cut -d: -f1 "+strings.Join(args, " "), dir) + assert.Equal(t, 0, code) + assert.Contains(t, stdout, "x") +} + +// --- Long lines --- + +func TestCutPentestLongLine(t *testing.T) { + dir := t.TempDir() + longLine := strings.Repeat("x", 1024*1024-1) + "\n" // MaxLineBytes - 1 + cutPentestWriteFile(t, dir, "file.txt", longLine) + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + stdout, _, code := cutPentestRunCtx(ctx, t, "cut -b1 file.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "x\n", stdout) +} + +func TestCutPentestLineBeyondMaxBytes(t *testing.T) { + dir := t.TempDir() + // Line exactly at MaxLineBytes + 1 will cause scanner error. 
+ longLine := strings.Repeat("x", 1024*1024+1) + cutPentestWriteFile(t, dir, "file.txt", longLine) + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + _, stderr, code := cutPentestRunCtx(ctx, t, "cut -b1 file.txt", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +// --- Edge: empty file --- + +func TestCutPentestEmptyFile(t *testing.T) { + dir := t.TempDir() + cutPentestWriteFile(t, dir, "file.txt", "") + stdout, _, code := cutPentestRun(t, "cut -b1 file.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "", stdout) +} + +// --- Edge: file with only newlines --- + +func TestCutPentestOnlyNewlines(t *testing.T) { + dir := t.TempDir() + cutPentestWriteFile(t, dir, "file.txt", strings.Repeat("\n", 100)) + stdout, _, code := cutPentestRun(t, "cut -b1 file.txt", dir) + assert.Equal(t, 0, code) + // Each empty line produces just a newline. + assert.Equal(t, strings.Repeat("\n", 100), stdout) +} + +// --- Behavior: binary input --- + +func TestCutPentestBinaryInput(t *testing.T) { + dir := t.TempDir() + // Binary data with embedded NULs. + content := []byte{0x00, 0x01, 0x02, 0x03, 0x0A, 0xFF, 0xFE, 0x0A} + require.NoError(t, os.WriteFile(filepath.Join(dir, "binary.bin"), content, 0644)) + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + _, _, code := cutPentestRunCtx(ctx, t, "cut -b1 binary.bin", dir) + // Should not crash on binary data. 
+ assert.Equal(t, 0, code) +} + +// --- Behavior: complement with large range --- + +func TestCutPentestComplementLargeRange(t *testing.T) { + dir := t.TempDir() + cutPentestWriteFile(t, dir, "file.txt", "abcdef\n") + stdout, _, code := cutPentestRun(t, "cut --complement -b2147483647 file.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "abcdef\n", stdout) +} + +// --- Decreasing range --- + +func TestCutPentestDecreasingRange(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cutPentestRun(t, "cut -f5-3 file", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "decreasing") +} + +// --- Bare dash in list --- + +func TestCutPentestBareDash(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cutPentestRun(t, "cut -f - file", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} diff --git a/interp/builtins/cut/cut.go b/interp/builtins/cut/cut.go new file mode 100644 index 00000000..a27ad0cc --- /dev/null +++ b/interp/builtins/cut/cut.go @@ -0,0 +1,554 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +// Package cut implements the cut builtin command. +// +// cut — remove sections from each line of files +// +// Usage: cut OPTION... [FILE]... +// +// Print selected parts of lines from each FILE to standard output. +// With no FILE, or when FILE is -, read standard input. +// +// Exactly one of -b, -c, or -f must be specified. +// +// Accepted flags: +// +// -b LIST, --bytes=LIST +// Select only these bytes. LIST is a comma-separated set of byte +// positions and ranges (e.g. 1,3-5,7-). Positions are 1-based. +// +// -c LIST, --characters=LIST +// Select only these characters (treated as bytes, matching GNU cut). +// Same list format as -b. 
+// +// -d DELIM, --delimiter=DELIM +// Use DELIM instead of TAB for field delimiter. Used with -f. +// +// -f LIST, --fields=LIST +// Select only these fields, separated by the delimiter character. +// Same list format as -b. +// +// -n +// (ignored) Accepted for POSIX compatibility but has no effect, +// matching GNU coreutils behavior. +// +// -s, --only-delimited +// Do not print lines not containing delimiters (only with -f). +// +// --complement +// Complement the set of selected bytes, characters, or fields. +// +// --output-delimiter=STRING +// Use STRING as the output delimiter. The default is the input +// delimiter. +// +// --help +// Print this usage message to stdout and exit 0. +// +// Exit codes: +// +// 0 All files processed successfully. +// 1 At least one error occurred (missing file, invalid argument, etc.). +// +// Memory safety: +// +// Lines are read via a streaming scanner with a per-line cap of +// MaxLineBytes (1 MiB). Lines exceeding this cap produce an error +// rather than an unbounded allocation. The per-file and per-line loops +// check ctx.Err() at each iteration to honour the shell's execution timeout. +package cut + +import ( + "bufio" + "context" + "io" + "math" + "os" + "slices" + "strconv" + "strings" + + "github.com/DataDog/rshell/interp/builtins" +) + +// Cmd is the cut builtin command descriptor. +var Cmd = builtins.Command{Name: "cut", MakeFlags: registerFlags} + +// MaxLineBytes is the per-line buffer cap for the line scanner. +const MaxLineBytes = 1 << 20 // 1 MiB + +// mode distinguishes the three mutually exclusive selection modes. +type mode int + +const ( + modeNone mode = iota + modeBytes // -b + modeChars // -c + modeFields // -f +) + +// registerFlags registers all cut flags on the framework-provided FlagSet and +// returns a bound handler whose flag variables are captured by closure.
+func registerFlags(fs *builtins.FlagSet) builtins.HandlerFunc { + help := fs.Bool("help", false, "print usage and exit") + bytesListStr := fs.StringP("bytes", "b", "", "select only these bytes") + charsListStr := fs.StringP("characters", "c", "", "select only these characters") + fieldsListStr := fs.StringP("fields", "f", "", "select only these fields") + delimiter := fs.StringP("delimiter", "d", "\t", "use DELIM instead of TAB for field delimiter") + onlyDelimited := fs.BoolP("only-delimited", "s", false, "do not print lines not containing delimiters") + _ = fs.BoolP("", "n", false, "(ignored) accepted for POSIX compatibility") + complement := fs.Bool("complement", false, "complement the set of selected bytes, characters, or fields") + outputDelimiter := fs.String("output-delimiter", "", "use STRING as the output delimiter") + + return func(ctx context.Context, callCtx *builtins.CallContext, files []string) builtins.Result { + if *help { + callCtx.Out("Usage: cut OPTION... [FILE]...\n") + callCtx.Out("Print selected parts of lines from each FILE to standard output.\n") + callCtx.Out("With no FILE, or when FILE is -, read standard input.\n\n") + fs.SetOutput(callCtx.Stdout) + fs.PrintDefaults() + return builtins.Result{} + } + + // Determine mode: exactly one of -b, -c, -f must be specified. + // Use fs.Changed() to detect whether the flag was explicitly provided, + // rather than comparing the value to "" (which would miss -b "").
+ var m mode + var listStr string + modeCount := 0 + if fs.Changed("bytes") { + m = modeBytes + listStr = *bytesListStr + modeCount++ + } + if fs.Changed("characters") { + m = modeChars + listStr = *charsListStr + modeCount++ + } + if fs.Changed("fields") { + m = modeFields + listStr = *fieldsListStr + modeCount++ + } + if modeCount == 0 { + callCtx.Errf("cut: you must specify a list of bytes, characters, or fields\n") + return builtins.Result{Code: 1} + } + if modeCount > 1 { + callCtx.Errf("cut: only one type of list may be specified\n") + return builtins.Result{Code: 1} + } + + // -d and -s are only valid with -f. + if m != modeFields { + if fs.Changed("delimiter") { + callCtx.Errf("cut: an input delimiter may be specified only when operating on fields\n") + return builtins.Result{Code: 1} + } + if *onlyDelimited { + callCtx.Errf("cut: suppressing non-delimited lines makes sense\n\tonly when operating on fields\n") + return builtins.Result{Code: 1} + } + } + + // Delimiter must be exactly one byte (GNU cut behavior). + if len(*delimiter) != 1 { + callCtx.Errf("cut: the delimiter must be a single character\n") + return builtins.Result{Code: 1} + } + delimByte := (*delimiter)[0] + + // Parse the list. + ranges, err := parseList(listStr) + if err != nil { + callCtx.Errf("cut: %s\n", err.Error()) + return builtins.Result{Code: 1} + } + + // Determine output delimiter. + outDelim := *delimiter + outDelimSet := fs.Changed("output-delimiter") + if outDelimSet { + outDelim = *outputDelimiter + } + + cfg := &cutConfig{ + mode: m, + ranges: ranges, + delimByte: delimByte, + onlyDelimited: *onlyDelimited, + complement: *complement, + outDelim: outDelim, + outDelimSet: outDelimSet, + } + + // Default to stdin when no file arguments were given. 
+ if len(files) == 0 { + files = []string{"-"} + } + + var failed bool + for _, file := range files { + if ctx.Err() != nil { + break + } + if err := processFile(ctx, callCtx, file, cfg); err != nil { + name := file + if file == "-" { + name = "standard input" + } + callCtx.Errf("cut: %s: %s\n", name, callCtx.PortableErr(err)) + failed = true + } + } + + if failed { + return builtins.Result{Code: 1} + } + return builtins.Result{} + } +} + +// cutConfig holds the parsed configuration for a cut invocation. +type cutConfig struct { + mode mode + ranges [][2]int // sorted, merged, 1-based inclusive ranges + delimByte byte + onlyDelimited bool + complement bool + outDelim string + outDelimSet bool +} + +// parseList parses a comma-separated list of ranges/positions into sorted, +// merged [2]int ranges (1-based inclusive). Open-ended ranges use +// math.MaxInt32 as sentinel. +func parseList(s string) ([][2]int, error) { + parts := strings.Split(s, ",") + var ranges [][2]int + for _, part := range parts { + if part == "" { + return nil, invalidRange(s) + } + dashIdx := strings.IndexByte(part, '-') + if dashIdx < 0 { + // Single number: N + n, err := strconv.Atoi(part) + if err != nil || n <= 0 { + return nil, invalidRange(part) + } + ranges = append(ranges, [2]int{n, n}) + } else { + left := part[:dashIdx] + right := part[dashIdx+1:] + // A bare "-" (both sides empty) is invalid. 
+ if left == "" && right == "" { + return nil, invalidRange(part) + } + var start, end int + if left == "" { + start = 1 + } else { + var err error + start, err = strconv.Atoi(left) + if err != nil || start <= 0 { + return nil, invalidRange(part) + } + } + if right == "" { + end = math.MaxInt32 + } else { + var err error + end, err = strconv.Atoi(right) + if err != nil || end <= 0 { + return nil, invalidRange(part) + } + } + if start > end { + return nil, invalidDecreasingRange(part) + } + ranges = append(ranges, [2]int{start, end}) + } + } + if len(ranges) == 0 { + return nil, invalidRange(s) + } + + // Sort by start, then by end. + slices.SortFunc(ranges, func(a, b [2]int) int { + if a[0] != b[0] { + return a[0] - b[0] + } + return a[1] - b[1] + }) + + // Merge overlapping ranges (but not merely adjacent ones, so that + // --output-delimiter can be inserted between adjacent ranges like 1-2,3-4). + merged := [][2]int{ranges[0]} + for _, r := range ranges[1:] { + last := &merged[len(merged)-1] + if r[0] <= last[1] { + // Truly overlapping: extend. + if r[1] > last[1] { + last[1] = r[1] + } + } else { + merged = append(merged, r) + } + } + return merged, nil +} + +func invalidRange(s string) error { + return cutError("invalid byte, character or field list: " + s) +} + +func invalidDecreasingRange(s string) error { + return cutError("invalid decreasing range: " + s) +} + +// cutError is a simple error type. +type cutError string + +func (e cutError) Error() string { return string(e) } + +// processFile opens and processes one file (or stdin for "-").
+func processFile(ctx context.Context, callCtx *builtins.CallContext, file string, cfg *cutConfig) error { + var rc io.ReadCloser + if file == "-" { + if callCtx.Stdin == nil { + return nil + } + rc = io.NopCloser(callCtx.Stdin) + } else { + f, err := callCtx.OpenFile(ctx, file, os.O_RDONLY, 0) + if err != nil { + return err + } + defer f.Close() + rc = f + } + + sc := bufio.NewScanner(rc) + buf := make([]byte, 4096) + sc.Buffer(buf, MaxLineBytes) + sc.Split(scanLinesPreservingNewline) + + for sc.Scan() { + if ctx.Err() != nil { + return ctx.Err() + } + line := sc.Bytes() + // Strip trailing newline for processing; we always add one back. + raw := stripNewline(line) + switch cfg.mode { + case modeBytes, modeChars: + // GNU coreutils treats -c identically to -b (byte-wise selection). + processBytes(callCtx, raw, cfg) + case modeFields: + processFields(callCtx, raw, cfg) + } + } + return sc.Err() +} + +// stripNewline removes a trailing \n from a byte slice. +// Only \n is stripped — \r is preserved as a regular content byte, +// matching GNU cut behavior where \r is not part of the line terminator. +func stripNewline(b []byte) []byte { + if len(b) > 0 && b[len(b)-1] == '\n' { + b = b[:len(b)-1] + } + return b +} + +// inRanges checks whether pos (1-based) falls within any of the sorted ranges. +func inRanges(pos int, ranges [][2]int) bool { + for _, r := range ranges { + if pos < r[0] { + return false // ranges are sorted, no need to continue + } + if pos <= r[1] { + return true + } + } + return false +} + +// processBytes selects bytes from a line. +func processBytes(callCtx *builtins.CallContext, raw []byte, cfg *cutConfig) { + n := len(raw) + if n == 0 { + callCtx.Out("\n") + return + } + + if cfg.complement { + // Select bytes NOT in ranges. 
+ if cfg.outDelimSet { + processBytesComplementWithOutDelim(callCtx, raw, cfg) + } else { + var sb strings.Builder + for i := range n { + pos := i + 1 + if !inRanges(pos, cfg.ranges) { + sb.WriteByte(raw[i]) + } + } + callCtx.Out(sb.String()) + } + } else { + if cfg.outDelimSet { + processBytesWithOutDelim(callCtx, raw, cfg) + } else { + var sb strings.Builder + for i := range n { + pos := i + 1 + if inRanges(pos, cfg.ranges) { + sb.WriteByte(raw[i]) + } + } + callCtx.Out(sb.String()) + } + } + callCtx.Out("\n") +} + +// processBytesWithOutDelim outputs selected byte ranges with the output +// delimiter inserted between non-contiguous ranges. +func processBytesWithOutDelim(callCtx *builtins.CallContext, raw []byte, cfg *cutConfig) { + n := len(raw) + first := true + for _, r := range cfg.ranges { + start := r[0] + end := r[1] + if start > n { + break + } + if end > n { + end = n + } + if !first { + callCtx.Out(cfg.outDelim) + } + _, _ = callCtx.Stdout.Write(raw[start-1 : end]) + first = false + } +} + +// processBytesComplementWithOutDelim outputs complemented byte ranges with output delimiter. +func processBytesComplementWithOutDelim(callCtx *builtins.CallContext, raw []byte, cfg *cutConfig) { + compRanges := complementRanges(cfg.ranges, len(raw)) + first := true + for _, r := range compRanges { + if !first { + callCtx.Out(cfg.outDelim) + } + _, _ = callCtx.Stdout.Write(raw[r[0]-1 : r[1]]) + first = false + } +} + +// processFields selects fields from a line. +func processFields(callCtx *builtins.CallContext, raw []byte, cfg *cutConfig) { + line := string(raw) + delimStr := string(cfg.delimByte) + + // Check if line contains the delimiter. + if strings.IndexByte(line, cfg.delimByte) < 0 { + if cfg.onlyDelimited { + return // suppress line + } + // No delimiter: print the whole line + newline. + callCtx.Out(line) + callCtx.Out("\n") + return + } + + fields := strings.Split(line, delimStr) + nFields := len(fields) + + // Determine which fields to select. 
+ var selected []int + if cfg.complement { + compRanges := complementRanges(cfg.ranges, nFields) + for _, r := range compRanges { + for i := r[0]; i <= r[1] && i <= nFields; i++ { + selected = append(selected, i) + } + } + } else { + for _, r := range cfg.ranges { + start := r[0] + end := r[1] + if start > nFields { + break + } + if end > nFields { + end = nFields + } + for i := start; i <= end; i++ { + selected = append(selected, i) + } + } + } + + // Output selected fields joined by the output delimiter. + for i, idx := range selected { + if i > 0 { + callCtx.Out(cfg.outDelim) + } + callCtx.Out(fields[idx-1]) + } + callCtx.Out("\n") +} + +// complementRanges returns the complement of the given sorted, merged ranges +// within [1, total]. +func complementRanges(ranges [][2]int, total int) [][2]int { + var result [][2]int + next := 1 + for _, r := range ranges { + start := r[0] + end := r[1] + if start > total { + break + } + if next < start { + result = append(result, [2]int{next, start - 1}) + } + if end >= total { + next = total + 1 + break + } + next = end + 1 + } + if next <= total { + result = append(result, [2]int{next, total}) + } + return result +} + +// scanLinesPreservingNewline is a bufio.SplitFunc that includes the line +// terminator (\n) in the returned token. Unlike bufio.ScanLines, it does not +// strip \r\n or \n; the caller strips only the trailing \n, so a \r before +// it stays part of the line content. If the file's last line has no +// terminator, the bare bytes are returned as the final token.
+func scanLinesPreservingNewline(data []byte, atEOF bool) (advance int, token []byte, err error) { + if atEOF && len(data) == 0 { + return 0, nil, nil + } + for i, b := range data { + if b == '\n' { + return i + 1, data[:i+1], nil + } + } + if atEOF { + return len(data), data, nil + } + return 0, nil, nil +} diff --git a/interp/builtins/tests/cut/cut_test.go b/interp/builtins/tests/cut/cut_test.go new file mode 100644 index 00000000..e37b1376 --- /dev/null +++ b/interp/builtins/tests/cut/cut_test.go @@ -0,0 +1,530 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package cut_test + +import ( + "bytes" + "context" + "errors" + "os" + "path/filepath" + "strings" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + "mvdan.cc/sh/v3/syntax" + + "github.com/DataDog/rshell/interp" +) + +func runScript(t *testing.T, script, dir string, opts ...interp.RunnerOption) (string, string, int) { + t.Helper() + return runScriptCtx(context.Background(), t, script, dir, opts...) +} + +func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ...interp.RunnerOption) (string, string, int) { + t.Helper() + parser := syntax.NewParser() + prog, err := parser.Parse(strings.NewReader(script), "") + require.NoError(t, err) + var outBuf, errBuf bytes.Buffer + allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...) + runner, err := interp.New(allOpts...) 
+ require.NoError(t, err) + defer runner.Close() + if dir != "" { + runner.Dir = dir + } + err = runner.Run(ctx, prog) + exitCode := 0 + if err != nil { + var es interp.ExitStatus + if errors.As(err, &es) { + exitCode = int(es) + } else if ctx.Err() == nil { + t.Fatalf("unexpected error: %v", err) + } + } + return outBuf.String(), errBuf.String(), exitCode +} + +func cmdRun(t *testing.T, script, dir string) (stdout, stderr string, exitCode int) { + t.Helper() + return runScript(t, script, dir, interp.AllowedPaths([]string{dir})) +} + +func writeFile(t *testing.T, dir, name, content string) { + t.Helper() + require.NoError(t, os.WriteFile(filepath.Join(dir, name), []byte(content), 0644)) +} + +// --- Basic field selection --- + +func TestCutFieldBasic(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:b:c\n") + stdout, _, code := cmdRun(t, "cut -d: -f1,3 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "a:c\n", stdout) +} + +func TestCutFieldRange(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:b:c\n") + stdout, _, code := cmdRun(t, "cut -d: -f2- input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "b:c\n", stdout) +} + +func TestCutFieldBeyondEnd(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:b:c\n") + stdout, _, code := cmdRun(t, "cut -d: -f4 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "\n", stdout) +} + +func TestCutFieldEmptyInput(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "") + stdout, _, code := cmdRun(t, "cut -d: -f4 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "", stdout) +} + +// --- Byte selection --- + +func TestCutByteSingle(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abcd\n") + stdout, _, code := cmdRun(t, "cut -b2 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "b\n", stdout) +} + +func TestCutByteRange(t *testing.T) { + dir := t.TempDir() + writeFile(t, 
dir, "input.txt", "abcdef\n") + stdout, _, code := cmdRun(t, "cut -b1-3 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "abc\n", stdout) +} + +func TestCutByteOpenEnd(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abcdef\n") + stdout, _, code := cmdRun(t, "cut -b3- input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "cdef\n", stdout) +} + +func TestCutByteOpenStart(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abcdef\n") + stdout, _, code := cmdRun(t, "cut -b-3 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "abc\n", stdout) +} + +func TestCutByteBeyondLine(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "123\n") + stdout, _, code := cmdRun(t, "cut -c4 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "\n", stdout) +} + +func TestCutByteEmptyInput(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "") + stdout, _, code := cmdRun(t, "cut -b1 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "", stdout) +} + +// --- Character selection --- + +func TestCutCharBasic(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abcd\n") + stdout, _, code := cmdRun(t, "cut -c2 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "b\n", stdout) +} + +func TestCutCharMultibyte(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "\xce\xb1\xce\xb2\xce\xb3\n") // αβγ + // GNU cut treats -c as byte-wise (same as -b), so -c1 selects only the first byte. 
+ stdout, _, code := cmdRun(t, "cut -c1 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "\xce\n", stdout) +} + +// --- Delimiter --- + +func TestCutOutputDelimiter(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:b:c\n") + stdout, _, code := cmdRun(t, "cut -d: --output-delimiter=_ -f2,3 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "b_c\n", stdout) +} + +func TestCutMulticharOutputDelim(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:b:c\n") + stdout, _, code := cmdRun(t, "cut -d: --output-delimiter=_._ -f2,3 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "b_._c\n", stdout) +} + +func TestCutOutputDelimBytes(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abcdefg\n") + stdout, _, code := cmdRun(t, "cut -c1-3,5- --output-delimiter=: input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "abc:efg\n", stdout) +} + +// --- Suppress (-s) --- + +func TestCutSuppressNoDelim(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abc\n") + stdout, _, code := cmdRun(t, "cut -s -d: -f2,3 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "", stdout) +} + +func TestCutSuppressWithDelim(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:b:c\n") + stdout, _, code := cmdRun(t, "cut -s -d: -f3- input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "c\n", stdout) +} + +// --- Complement --- + +func TestCutComplement(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "9_1\n8_2\n") + stdout, _, code := cmdRun(t, "cat input.txt | cut --complement -d_ -f2", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "9\n8\n", stdout) +} + +// --- Newline handling --- + +func TestCutNewlinePreserved(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a\nb") + stdout, _, code := cmdRun(t, "cut -f1- input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "a\nb\n", stdout) +} + 
+func TestCutFieldNoTrailingNewline(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:1\nb:2") + stdout, _, code := cmdRun(t, "cut -d: -f1 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "a\nb\n", stdout) +} + +// --- Errors --- + +func TestCutNoMode(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a\n") + _, stderr, code := cmdRun(t, "cut input.txt", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +func TestCutZeroPosition(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cmdRun(t, "cut -b0 input.txt", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +func TestCutDecreasingRange(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cmdRun(t, "cut -f 2-0 input.txt", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +func TestCutSuppressWithoutFields(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cmdRun(t, "cut -s -b4 input.txt", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +func TestCutDelimWithoutFields(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cmdRun(t, "cut -d: -b1 input.txt", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +func TestCutMissingFile(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cmdRun(t, "cut -b1 nonexistent", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +func TestCutMulticharDelim(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cmdRun(t, "cut -d ab -f1 input.txt", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +// --- Help --- + +func TestCutHelp(t *testing.T) { + dir := t.TempDir() + stdout, _, code := cmdRun(t, "cut --help", dir) + assert.Equal(t, 0, code) + assert.Contains(t, stdout, "Usage:") +} + +// --- Stdin --- + +func TestCutStdin(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:b:c\n") + stdout, _, code := 
cmdRun(t, "cat input.txt | cut -d: -f2", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "b\n", stdout) +} + +// --- Edge cases from GNU tests --- + +func TestCutEmptyFields(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", ":::\n") + stdout, _, code := cmdRun(t, "cut -d: -f1-3 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "::\n", stdout) +} + +func TestCutOverlappingUnbounded(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "1234\n") + stdout, _, code := cmdRun(t, "cut -b3-,2- input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "234\n", stdout) +} + +func TestCutBigUnboundedRange(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "") + stdout, _, code := cmdRun(t, "cut -b1234567890- input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "", stdout) +} + +// --- Coverage: processBytesWithOutDelim --- + +func TestCutBytesWithOutputDelim(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abcdefg\n") + // Non-contiguous byte ranges with output delimiter + stdout, _, code := cmdRun(t, "cut -b1-2,5- --output-delimiter=: input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "ab:efg\n", stdout) +} + +func TestCutBytesWithOutputDelimBeyondLine(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abc\n") + // Range extends beyond line length + stdout, _, code := cmdRun(t, "cut -b1,5- --output-delimiter=: input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "a\n", stdout) +} + +// --- Coverage: processBytesComplementWithOutDelim --- + +func TestCutBytesComplementWithOutputDelim(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abcdef\n") + stdout, _, code := cmdRun(t, "cut --complement -b3-4 --output-delimiter=: input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "ab:ef\n", stdout) +} + +// --- Coverage: complement bytes without output delim --- + +func TestCutBytesComplementNoOutDelim(t 
*testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abcdef\n") + stdout, _, code := cmdRun(t, "cut --complement -b3-4 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "abef\n", stdout) +} + +// --- Coverage: -n flag (ignored, matching GNU coreutils) --- + +func TestCutBytesNFlagIsNoOp(t *testing.T) { + dir := t.TempDir() + // α is 2 bytes (0xCE 0xB1), β is 2 bytes (0xCE 0xB2), γ is 2 bytes (0xCE 0xB3) + writeFile(t, dir, "input.txt", "\xce\xb1\xce\xb2\xce\xb3\n") // αβγ + // GNU cut ignores -n: -b1 -n selects only the first byte (0xCE), not the full character + stdout, _, code := cmdRun(t, "cut -b1 -n input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "\xce\n", stdout) +} + +func TestCutBytesNFlagRangeIsNoOp(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "\xce\xb1\xce\xb2\xce\xb3\n") // αβγ + // GNU cut ignores -n: -b1-3 -n selects bytes 1,2,3 (0xCE 0xB1 0xCE) + stdout, _, code := cmdRun(t, "cut -b1-3 -n input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "\xce\xb1\xce\n", stdout) +} + +func TestCutBytesNFlagWithOutputDelim(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "\xce\xb1\xce\xb2\xce\xb3\n") // αβγ + // GNU cut ignores -n: -b1,5 -n selects bytes 1 and 5 (0xCE and 0xCE) + stdout, _, code := cmdRun(t, "cut -b1,5 -n --output-delimiter=: input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "\xce:\xce\n", stdout) +} + +func TestCutBytesNFlagComplement(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "\xce\xb1\xce\xb2\xce\xb3\n") // αβγ + // GNU cut ignores -n: --complement -b1 -n removes byte 1, keeps bytes 2-6 + stdout, _, code := cmdRun(t, "cut -b1 -n --complement input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "\xb1\xce\xb2\xce\xb3\n", stdout) +} + +// --- Coverage: CRLF line endings --- + +func TestCutCRLFLineEnding(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:b:c\r\n") + stdout, _, code := 
cmdRun(t, "cut -d: -f2 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "b\n", stdout) +} + +func TestCutCRLFLineEndingLastField(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:b:c\r\n") + // GNU cut preserves \r as part of the last field content + stdout, _, code := cmdRun(t, "cut -d: -f3 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "c\r\n", stdout) +} + +func TestCutCRLFByteMode(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "ab\r\n") + // GNU cut treats \r as byte 3 (regular content byte) + stdout, _, code := cmdRun(t, "cut -b3 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "\r\n", stdout) +} + +// --- Coverage: decreasing range error --- + +func TestCutDecreasingRangeBytes(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cmdRun(t, "cut -b 5-3 input.txt", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "decreasing") +} + +// --- Coverage: multiple modes error --- + +func TestCutMultipleModes(t *testing.T) { + dir := t.TempDir() + _, stderr, code := cmdRun(t, "cut -b1 -f1 input.txt", dir) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "cut:") +} + +// --- Coverage: chars complement --- + +func TestCutCharsComplement(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abcdef\n") + stdout, _, code := cmdRun(t, "cut --complement -c2,4 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "acef\n", stdout) +} + +// --- Coverage: chars with output delimiter --- + +func TestCutCharsWithOutputDelim(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abcdef\n") + stdout, _, code := cmdRun(t, "cut -c1-2,5- --output-delimiter=: input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "ab:ef\n", stdout) +} + +// --- Coverage: chars complement with output delimiter --- + +func TestCutCharsComplementWithOutputDelim(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "abcdef\n") + 
stdout, _, code := cmdRun(t, "cut --complement -c3-4 --output-delimiter=: input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "ab:ef\n", stdout) +} + +// --- Coverage: context cancellation --- + +func TestCutContextCancelled(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:b\n") + ctx, cancel := context.WithCancel(context.Background()) + cancel() + _, _, _ = runScriptCtx(ctx, t, "cut -d: -f1 input.txt", dir, interp.AllowedPaths([]string{dir})) +} + +// --- Coverage: stdin with no files (dash) --- + +func TestCutStdinDash(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "x:y:z\n") + stdout, _, code := cmdRun(t, "cat input.txt | cut -d: -f1 -", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "x\n", stdout) +} + +// --- Coverage: field complement with output delimiter --- + +func TestCutFieldComplementWithOutputDelim(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "a:b:c:d\n") + stdout, _, code := cmdRun(t, "cut -d: --complement --output-delimiter=_ -f2,3 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "a_d\n", stdout) +} + +// --- Coverage: empty line in byte mode --- + +func TestCutBytesEmptyLine(t *testing.T) { + dir := t.TempDir() + writeFile(t, dir, "input.txt", "\n") + stdout, _, code := cmdRun(t, "cut -b1 input.txt", dir) + assert.Equal(t, 0, code) + assert.Equal(t, "\n", stdout) +} diff --git a/interp/register_builtins.go b/interp/register_builtins.go index 5356a203..772e6e40 100644 --- a/interp/register_builtins.go +++ b/interp/register_builtins.go @@ -11,6 +11,7 @@ import ( "github.com/DataDog/rshell/interp/builtins" breakcmd "github.com/DataDog/rshell/interp/builtins/break" "github.com/DataDog/rshell/interp/builtins/cat" + "github.com/DataDog/rshell/interp/builtins/cut" continuecmd "github.com/DataDog/rshell/interp/builtins/continue" "github.com/DataDog/rshell/interp/builtins/echo" "github.com/DataDog/rshell/interp/builtins/exit" @@ -31,6 +32,7 @@ func 
registerBuiltins() { for _, cmd := range []builtins.Command{ breakcmd.Cmd, cat.Cmd, + cut.Cmd, continuecmd.Cmd, echo.Cmd, exit.Cmd, diff --git a/tests/import_allowlist_test.go b/tests/import_allowlist_test.go index 8c69085a..862b0a98 100644 --- a/tests/import_allowlist_test.go +++ b/tests/import_allowlist_test.go @@ -70,6 +70,8 @@ var builtinAllowedSymbols = []string{ "io.ReadCloser", // io.Reader — interface type; no side effects. "io.Reader", + // math.MaxInt32 — integer constant; no side effects. + "math.MaxInt32", // math.MaxInt64 — integer constant; no side effects. "math.MaxInt64", // math.MinInt64 — integer constant; no side effects. @@ -96,6 +98,10 @@ var builtinAllowedSymbols = []string{ "strconv.FormatInt", // strings.HasPrefix — pure function for prefix matching; no I/O. "strings.HasPrefix", + // strings.IndexByte — finds byte in string; pure function, no I/O. + "strings.IndexByte", + // strings.Split — splits string by separator; pure function, no I/O. + "strings.Split", // strings.TrimSpace — removes leading/trailing whitespace; pure function. "strings.TrimSpace", // io.WriteString — writes a string to a writer; no filesystem access, delegates to Write. diff --git a/tests/scenarios/cmd/cut/bytes/byte_beyond_line.yaml b/tests/scenarios/cmd/cut/bytes/byte_beyond_line.yaml new file mode 100644 index 00000000..2225193b --- /dev/null +++ b/tests/scenarios/cmd/cut/bytes/byte_beyond_line.yaml @@ -0,0 +1,15 @@ +# Derived from GNU coreutils cut.pl test 6 +description: Cut outputs empty line when byte position is beyond line length. 
+setup:
+  files:
+    - path: input.txt
+      content: "123\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -c4 input.txt
+expect:
+  stdout: "\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/bytes/byte_beyond_no_newline.yaml b/tests/scenarios/cmd/cut/bytes/byte_beyond_no_newline.yaml
new file mode 100644
index 00000000..718dca46
--- /dev/null
+++ b/tests/scenarios/cmd/cut/bytes/byte_beyond_no_newline.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test 7
+description: Cut outputs a newline even when input has no trailing newline and byte is beyond line.
+setup:
+  files:
+    - path: input.txt
+      content: "123"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -c4 input.txt
+expect:
+  stdout: "\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/bytes/byte_empty.yaml b/tests/scenarios/cmd/cut/bytes/byte_empty.yaml
new file mode 100644
index 00000000..722d7752
--- /dev/null
+++ b/tests/scenarios/cmd/cut/bytes/byte_empty.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test 9
+description: Cut on empty input produces no output for byte selection.
+setup:
+  files:
+    - path: input.txt
+      content: ""
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -c4 input.txt
+expect:
+  stdout: ""
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/bytes/byte_empty_input.yaml b/tests/scenarios/cmd/cut/bytes/byte_empty_input.yaml
new file mode 100644
index 00000000..15b0eb1e
--- /dev/null
+++ b/tests/scenarios/cmd/cut/bytes/byte_empty_input.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test w
+description: Cut -b1 on empty input produces no output.
+setup:
+  files:
+    - path: input.txt
+      content: ""
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -b1 input.txt
+expect:
+  stdout: ""
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/bytes/byte_on_multiline.yaml b/tests/scenarios/cmd/cut/bytes/byte_on_multiline.yaml
new file mode 100644
index 00000000..5157d0f7
--- /dev/null
+++ b/tests/scenarios/cmd/cut/bytes/byte_on_multiline.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test 8
+description: Cut outputs empty lines for each line when byte position is beyond content.
+setup:
+  files:
+    - path: input.txt
+      content: "123\n1"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -c4 input.txt
+expect:
+  stdout: "\n\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/complement/complement_bytes.yaml b/tests/scenarios/cmd/cut/complement/complement_bytes.yaml
new file mode 100644
index 00000000..afec1d13
--- /dev/null
+++ b/tests/scenarios/cmd/cut/complement/complement_bytes.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl EOL-subsumed-3 test
+description: Cut with --complement on bytes excludes the specified byte positions.
+setup:
+  files:
+    - path: input.txt
+      content: "123456\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut --complement -b3,4-4,5,2- input.txt
+expect:
+  stdout: "1\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/complement/complement_fields.yaml b/tests/scenarios/cmd/cut/complement/complement_fields.yaml
new file mode 100644
index 00000000..1deb2b07
--- /dev/null
+++ b/tests/scenarios/cmd/cut/complement/complement_fields.yaml
@@ -0,0 +1,15 @@
+# Derived from uutils test_cut.rs::test_complement
+description: Cut with --complement excludes the specified field.
+setup:
+  files:
+    - path: input.txt
+      content: "9_1\n8_2\n7_3\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d_ --complement -f2 input.txt
+expect:
+  stdout: "9\n8\n7\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/delimiter/adjacent_byte_ranges.yaml b/tests/scenarios/cmd/cut/delimiter/adjacent_byte_ranges.yaml
new file mode 100644
index 00000000..1efd3703
--- /dev/null
+++ b/tests/scenarios/cmd/cut/delimiter/adjacent_byte_ranges.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl od-abut test
+description: Cut inserts output delimiter between adjacent but separate byte ranges.
+setup:
+  files:
+    - path: input.txt
+      content: "abcd\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -b1-2,3-4 --output-delimiter=: input.txt
+expect:
+  stdout: "ab:cd\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/delimiter/multichar_output_delim.yaml b/tests/scenarios/cmd/cut/delimiter/multichar_output_delim.yaml
new file mode 100644
index 00000000..94c05bce
--- /dev/null
+++ b/tests/scenarios/cmd/cut/delimiter/multichar_output_delim.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl multichar-od test
+description: Cut supports multi-character output delimiters.
+setup:
+  files:
+    - path: input.txt
+      content: "a:b:c\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d: --output-delimiter=_._ -f2,3 input.txt
+expect:
+  stdout: "b_._c\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/delimiter/output_delim_bytes.yaml b/tests/scenarios/cmd/cut/delimiter/output_delim_bytes.yaml
new file mode 100644
index 00000000..25d573c3
--- /dev/null
+++ b/tests/scenarios/cmd/cut/delimiter/output_delim_bytes.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl out-delim1 test
+description: Cut inserts output delimiter between non-adjacent byte ranges.
+setup:
+  files:
+    - path: input.txt
+      content: "abcdefg\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -c1-3,5- --output-delimiter=: input.txt
+expect:
+  stdout: "abc:efg\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/delimiter/output_delim_overlap.yaml b/tests/scenarios/cmd/cut/delimiter/output_delim_overlap.yaml
new file mode 100644
index 00000000..f805a775
--- /dev/null
+++ b/tests/scenarios/cmd/cut/delimiter/output_delim_overlap.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl out-delim2 test
+description: Cut merges overlapping byte ranges and inserts output delimiter at gaps.
+setup:
+  files:
+    - path: input.txt
+      content: "abcdefg\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -c1-3,2,5- --output-delimiter=: input.txt
+expect:
+  stdout: "abc:efg\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/delimiter/output_delim_partial.yaml b/tests/scenarios/cmd/cut/delimiter/output_delim_partial.yaml
new file mode 100644
index 00000000..c7a09ea6
--- /dev/null
+++ b/tests/scenarios/cmd/cut/delimiter/output_delim_partial.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl out-delim3 test
+description: Cut merges overlapping byte ranges and separates disjoint ones with output delimiter.
+setup:
+  files:
+    - path: input.txt
+      content: "abcdefg\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -c1-3,2-4,6 --output-delimiter=: input.txt
+expect:
+  stdout: "abcd:f\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/delimiter/output_delimiter.yaml b/tests/scenarios/cmd/cut/delimiter/output_delimiter.yaml
new file mode 100644
index 00000000..f09cf623
--- /dev/null
+++ b/tests/scenarios/cmd/cut/delimiter/output_delimiter.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl o-delim test
+description: Cut replaces the input delimiter with the output delimiter between fields.
+setup:
+  files:
+    - path: input.txt
+      content: "a:b:c\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d: --output-delimiter=_ -f2,3 input.txt
+expect:
+  stdout: "b_c\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/errors/bare_dash.yaml b/tests/scenarios/cmd/cut/errors/bare_dash.yaml
new file mode 100644
index 00000000..d8ab3279
--- /dev/null
+++ b/tests/scenarios/cmd/cut/errors/bare_dash.yaml
@@ -0,0 +1,9 @@
+# Derived from GNU coreutils cut.pl inval2 test
+description: Cut rejects a bare dash as a field specification.
+input:
+  script: |+
+    cut -f -
+expect:
+  stdout: ""
+  stderr_contains: ["cut:"]
+  exit_code: 1
diff --git a/tests/scenarios/cmd/cut/errors/decreasing_range.yaml b/tests/scenarios/cmd/cut/errors/decreasing_range.yaml
new file mode 100644
index 00000000..57508805
--- /dev/null
+++ b/tests/scenarios/cmd/cut/errors/decreasing_range.yaml
@@ -0,0 +1,9 @@
+# Derived from GNU coreutils cut.pl inval1 test
+description: Cut rejects a decreasing range specification.
+input:
+  script: |+
+    cut -f 2-0
+expect:
+  stdout: ""
+  stderr_contains: ["cut:"]
+  exit_code: 1
diff --git a/tests/scenarios/cmd/cut/errors/delim_multichar.yaml b/tests/scenarios/cmd/cut/errors/delim_multichar.yaml
new file mode 100644
index 00000000..36c00298
--- /dev/null
+++ b/tests/scenarios/cmd/cut/errors/delim_multichar.yaml
@@ -0,0 +1,9 @@
+# Derived from uutils test_cut.rs::test_delimiter_must_be_single_char
+description: Cut rejects a multi-character delimiter.
+input:
+  script: |+
+    cut -d ab -f1
+expect:
+  stdout: ""
+  stderr_contains: ["cut:"]
+  exit_code: 1
diff --git a/tests/scenarios/cmd/cut/errors/delim_without_fields.yaml b/tests/scenarios/cmd/cut/errors/delim_without_fields.yaml
new file mode 100644
index 00000000..60326788
--- /dev/null
+++ b/tests/scenarios/cmd/cut/errors/delim_without_fields.yaml
@@ -0,0 +1,9 @@
+# Derived from GNU coreutils cut.pl delim-no-field1 test
+description: Cut rejects -d delimiter option when used with byte mode.
+input:
+  script: |+
+    cut -d: -b1
+expect:
+  stdout: ""
+  stderr_contains: ["cut:"]
+  exit_code: 1
diff --git a/tests/scenarios/cmd/cut/errors/missing_file.yaml b/tests/scenarios/cmd/cut/errors/missing_file.yaml
new file mode 100644
index 00000000..2544af11
--- /dev/null
+++ b/tests/scenarios/cmd/cut/errors/missing_file.yaml
@@ -0,0 +1,10 @@
+# Derived from GNU coreutils cut error handling
+description: Cut reports error when input file does not exist.
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -b1 nonexistent
+expect:
+  stdout: ""
+  stderr_contains: ["cut:"]
+  exit_code: 1
diff --git a/tests/scenarios/cmd/cut/errors/no_mode.yaml b/tests/scenarios/cmd/cut/errors/no_mode.yaml
new file mode 100644
index 00000000..3fc50c44
--- /dev/null
+++ b/tests/scenarios/cmd/cut/errors/no_mode.yaml
@@ -0,0 +1,9 @@
+# Derived from GNU coreutils cut.pl error handling
+description: Cut with no arguments exits with error.
+input:
+  script: |+
+    cut
+expect:
+  stdout: ""
+  stderr_contains: ["cut:"]
+  exit_code: 1
diff --git a/tests/scenarios/cmd/cut/errors/suppress_without_fields.yaml b/tests/scenarios/cmd/cut/errors/suppress_without_fields.yaml
new file mode 100644
index 00000000..a64e529d
--- /dev/null
+++ b/tests/scenarios/cmd/cut/errors/suppress_without_fields.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test y
+description: Cut rejects -s flag when used with byte mode instead of field mode.
+setup:
+  files:
+    - path: input.txt
+      content: ":\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -s -b4 input.txt
+expect:
+  stdout: ""
+  stderr_contains: ["cut:"]
+  exit_code: 1
diff --git a/tests/scenarios/cmd/cut/errors/zero_field.yaml b/tests/scenarios/cmd/cut/errors/zero_field.yaml
new file mode 100644
index 00000000..ec42d0a3
--- /dev/null
+++ b/tests/scenarios/cmd/cut/errors/zero_field.yaml
@@ -0,0 +1,9 @@
+# Derived from GNU coreutils cut.pl zero-2 test
+description: Cut rejects field position zero in a range.
+input:
+  script: |+
+    cut -f0-2
+expect:
+  stdout: ""
+  stderr_contains: ["cut:"]
+  exit_code: 1
diff --git a/tests/scenarios/cmd/cut/errors/zero_position.yaml b/tests/scenarios/cmd/cut/errors/zero_position.yaml
new file mode 100644
index 00000000..b26f4a92
--- /dev/null
+++ b/tests/scenarios/cmd/cut/errors/zero_position.yaml
@@ -0,0 +1,9 @@
+# Derived from GNU coreutils cut.pl zero-1 test
+description: Cut rejects byte position zero.
+input:
+  script: |+
+    cut -b0
+expect:
+  stdout: ""
+  stderr_contains: ["cut:"]
+  exit_code: 1
diff --git a/tests/scenarios/cmd/cut/fields/basic_field_select.yaml b/tests/scenarios/cmd/cut/fields/basic_field_select.yaml
new file mode 100644
index 00000000..5d718973
--- /dev/null
+++ b/tests/scenarios/cmd/cut/fields/basic_field_select.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test 1
+description: Cut selects specific fields and field ranges with a delimiter.
+setup:
+  files:
+    - path: input.txt
+      content: "a:b:c\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d: -f1,3- input.txt
+expect:
+  stdout: "a:c\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/fields/empty_fields.yaml b/tests/scenarios/cmd/cut/fields/empty_fields.yaml
new file mode 100644
index 00000000..ffcd4a2d
--- /dev/null
+++ b/tests/scenarios/cmd/cut/fields/empty_fields.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test i
+description: Cut handles lines with consecutive delimiters producing empty fields.
+setup:
+  files:
+    - path: input.txt
+      content: ":::\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d: -f1-3 input.txt
+expect:
+  stdout: "::\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/fields/empty_input_field.yaml b/tests/scenarios/cmd/cut/fields/empty_input_field.yaml
new file mode 100644
index 00000000..2c58609b
--- /dev/null
+++ b/tests/scenarios/cmd/cut/fields/empty_input_field.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test 5
+description: Cut on empty input produces no output.
+setup:
+  files:
+    - path: input.txt
+      content: ""
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d: -f4 input.txt
+expect:
+  stdout: ""
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/fields/field_beyond_end.yaml b/tests/scenarios/cmd/cut/fields/field_beyond_end.yaml
new file mode 100644
index 00000000..9bb55cf6
--- /dev/null
+++ b/tests/scenarios/cmd/cut/fields/field_beyond_end.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test 4
+description: Cut outputs empty line when requested field is beyond the number of fields.
+setup:
+  files:
+    - path: input.txt
+      content: "a:b:c\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d: -f4 input.txt
+expect:
+  stdout: "\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/fields/field_range.yaml b/tests/scenarios/cmd/cut/fields/field_range.yaml
new file mode 100644
index 00000000..14b3603a
--- /dev/null
+++ b/tests/scenarios/cmd/cut/fields/field_range.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test 3
+description: Cut selects fields from a given position to end of line.
+setup:
+  files:
+    - path: input.txt
+      content: "a:b:c\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d: -f2- input.txt
+expect:
+  stdout: "b:c\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/fields/suppress_no_delim.yaml b/tests/scenarios/cmd/cut/fields/suppress_no_delim.yaml
new file mode 100644
index 00000000..cfe52f52
--- /dev/null
+++ b/tests/scenarios/cmd/cut/fields/suppress_no_delim.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test a
+description: Cut with -s suppresses lines without the delimiter and selects field range.
+setup:
+  files:
+    - path: input.txt
+      content: "a:b:c\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -s -d: -f3- input.txt
+expect:
+  stdout: "c\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/fields/suppress_select_fields.yaml b/tests/scenarios/cmd/cut/fields/suppress_select_fields.yaml
new file mode 100644
index 00000000..1941b9af
--- /dev/null
+++ b/tests/scenarios/cmd/cut/fields/suppress_select_fields.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test b
+description: Cut with -s selects multiple fields from a delimited line.
+setup:
+  files:
+    - path: input.txt
+      content: "a:b:c\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -s -d: -f2,3 input.txt
+expect:
+  stdout: "b:c\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/fields/suppress_skips_no_delim.yaml b/tests/scenarios/cmd/cut/fields/suppress_skips_no_delim.yaml
new file mode 100644
index 00000000..8f89545e
--- /dev/null
+++ b/tests/scenarios/cmd/cut/fields/suppress_skips_no_delim.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test h
+description: Cut with -s skips lines that do not contain the delimiter.
+setup:
+  files:
+    - path: input.txt
+      content: "abc\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -s -d: -f2,3 input.txt
+expect:
+  stdout: ""
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/fields/trailing_delim.yaml b/tests/scenarios/cmd/cut/fields/trailing_delim.yaml
new file mode 100644
index 00000000..43703984
--- /dev/null
+++ b/tests/scenarios/cmd/cut/fields/trailing_delim.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test d
+description: Cut with -s selects non-adjacent fields from a line with trailing delimiter.
+setup:
+  files:
+    - path: input.txt
+      content: "a:b:c:\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -s -d: -f1,3 input.txt
+expect:
+  stdout: "a:c\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/fields/trailing_delim_range.yaml b/tests/scenarios/cmd/cut/fields/trailing_delim_range.yaml
new file mode 100644
index 00000000..820083ec
--- /dev/null
+++ b/tests/scenarios/cmd/cut/fields/trailing_delim_range.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl test e
+description: Cut with -s selects field range from a line with trailing delimiter.
+setup:
+  files:
+    - path: input.txt
+      content: "a:b:c:\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -s -d: -f3- input.txt
+expect:
+  stdout: "c:\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/newlines/newline_field2.yaml b/tests/scenarios/cmd/cut/newlines/newline_field2.yaml
new file mode 100644
index 00000000..e2e5932e
--- /dev/null
+++ b/tests/scenarios/cmd/cut/newlines/newline_field2.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl newline-5 test
+description: Cut extracts second field from each line of a multiline delimited file.
+setup:
+  files:
+    - path: input.txt
+      content: "a:1\nb:2\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d: -f2 input.txt
+expect:
+  stdout: "1\n2\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/newlines/newline_field2_no_trailing.yaml b/tests/scenarios/cmd/cut/newlines/newline_field2_no_trailing.yaml
new file mode 100644
index 00000000..0769ae90
--- /dev/null
+++ b/tests/scenarios/cmd/cut/newlines/newline_field2_no_trailing.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl newline-6 test
+description: Cut extracts second field and adds newline when input lacks trailing newline.
+setup:
+  files:
+    - path: input.txt
+      content: "a:1\nb:2"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d: -f2 input.txt
+expect:
+  stdout: "1\n2\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/newlines/newline_field_all.yaml b/tests/scenarios/cmd/cut/newlines/newline_field_all.yaml
new file mode 100644
index 00000000..d1874fbb
--- /dev/null
+++ b/tests/scenarios/cmd/cut/newlines/newline_field_all.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl newline-1 test
+description: Cut selecting all fields preserves content and adds trailing newline.
+setup:
+  files:
+    - path: input.txt
+      content: "a\nb"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -f1- input.txt
+expect:
+  stdout: "a\nb\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/newlines/newline_field_delim.yaml b/tests/scenarios/cmd/cut/newlines/newline_field_delim.yaml
new file mode 100644
index 00000000..a5cdcfbb
--- /dev/null
+++ b/tests/scenarios/cmd/cut/newlines/newline_field_delim.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl newline-3 test
+description: Cut extracts first field from each line of a multiline delimited file.
+setup:
+  files:
+    - path: input.txt
+      content: "a:1\nb:2\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d: -f1 input.txt
+expect:
+  stdout: "a\nb\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/newlines/newline_no_trailing.yaml b/tests/scenarios/cmd/cut/newlines/newline_no_trailing.yaml
new file mode 100644
index 00000000..399457e7
--- /dev/null
+++ b/tests/scenarios/cmd/cut/newlines/newline_no_trailing.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl newline-4 test
+description: Cut adds trailing newline to last line even when input lacks one.
+setup:
+  files:
+    - path: input.txt
+      content: "a:1\nb:2"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -d: -f1 input.txt
+expect:
+  stdout: "a\nb\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/ranges/big_unbounded.yaml b/tests/scenarios/cmd/cut/ranges/big_unbounded.yaml
new file mode 100644
index 00000000..4ba566b1
--- /dev/null
+++ b/tests/scenarios/cmd/cut/ranges/big_unbounded.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl big-unbounded-b test
+description: Cut handles very large start position in unbounded byte range on empty input.
+setup:
+  files:
+    - path: input.txt
+      content: ""
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut --output-delimiter=: -b1234567890- input.txt
+expect:
+  stdout: ""
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/ranges/overlapping_unbounded.yaml b/tests/scenarios/cmd/cut/ranges/overlapping_unbounded.yaml
new file mode 100644
index 00000000..0878726e
--- /dev/null
+++ b/tests/scenarios/cmd/cut/ranges/overlapping_unbounded.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl overlapping-unbounded-1 test
+description: Cut merges overlapping unbounded byte ranges correctly.
+setup:
+  files:
+    - path: input.txt
+      content: "1234\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut -b3-,2- input.txt
+expect:
+  stdout: "234\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/ranges/subsumed_eol.yaml b/tests/scenarios/cmd/cut/ranges/subsumed_eol.yaml
new file mode 100644
index 00000000..edb142de
--- /dev/null
+++ b/tests/scenarios/cmd/cut/ranges/subsumed_eol.yaml
@@ -0,0 +1,15 @@
+# Derived from GNU coreutils cut.pl EOL-subsumed-1 test
+description: Cut merges ranges where later ranges are subsumed by an unbounded range.
+setup:
+  files:
+    - path: input.txt
+      content: "123456\n"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    cut --output-delimiter=: -b2-,3,4-4,5 input.txt
+expect:
+  stdout: "23456\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/stdin/stdin_dash.yaml b/tests/scenarios/cmd/cut/stdin/stdin_dash.yaml
new file mode 100644
index 00000000..ac6f86e7
--- /dev/null
+++ b/tests/scenarios/cmd/cut/stdin/stdin_dash.yaml
@@ -0,0 +1,9 @@
+# Derived from standard cut stdin usage with explicit dash
+description: Cut reads from stdin when dash is given as the file argument.
+input:
+  script: |+
+    echo "a:b:c" | cut -d: -f1 -
+expect:
+  stdout: "a\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/cut/stdin/stdin_pipe.yaml b/tests/scenarios/cmd/cut/stdin/stdin_pipe.yaml
new file mode 100644
index 00000000..4f0e819c
--- /dev/null
+++ b/tests/scenarios/cmd/cut/stdin/stdin_pipe.yaml
@@ -0,0 +1,9 @@
+# Derived from standard cut stdin usage
+description: Cut reads from stdin when input is piped.
+input:
+  script: |+
+    echo "a:b:c" | cut -d: -f2
+expect:
+  stdout: "b\n"
+  stderr: ""
+  exit_code: 0
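
Reviewer note: the range and output-delimiter scenarios above (`adjacent_byte_ranges`, `output_delim_overlap`, `output_delim_partial`, `subsumed_eol`, `overlapping_unbounded`) all rest on one rule: overlapping byte ranges merge, while ranges that merely abut stay separate and get the output delimiter between them. The following is a minimal sketch of that rule, not the `cut` builtin from this PR, whose source is not shown here. The names `byteRange`, `mergeRanges`, and `cutBytes` are hypothetical, and `hi == 0` is an assumed encoding for an open-ended range such as `5-`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// byteRange is a 1-based inclusive range (hypothetical type for this sketch).
// hi == 0 is an assumed encoding for "through end of line", e.g. "5-".
type byteRange struct{ lo, hi int }

// mergeRanges sorts ranges by start and merges only those that overlap.
// Ranges that merely abut (1-2 and 3-4) are kept separate, which is what
// the adjacent_byte_ranges scenario expects.
func mergeRanges(in []byteRange) []byteRange {
	rs := append([]byteRange(nil), in...)
	sort.Slice(rs, func(i, j int) bool { return rs[i].lo < rs[j].lo })
	var out []byteRange
	for _, r := range rs {
		if n := len(out); n > 0 {
			last := &out[n-1]
			if last.hi == 0 {
				continue // previous range is unbounded and subsumes r
			}
			if r.lo <= last.hi {
				// Overlap: extend the previous range if r reaches further.
				if r.hi == 0 || r.hi > last.hi {
					last.hi = r.hi
				}
				continue
			}
		}
		out = append(out, r)
	}
	return out
}

// cutBytes selects the merged ranges from one line (without its newline)
// and joins the selected pieces with outDelim.
func cutBytes(line string, ranges []byteRange, outDelim string) string {
	var parts []string
	for _, r := range mergeRanges(ranges) {
		if r.lo > len(line) {
			break // ranges are sorted, so later ones are out of bounds too
		}
		hi := r.hi
		if hi == 0 || hi > len(line) {
			hi = len(line)
		}
		parts = append(parts, line[r.lo-1:hi])
	}
	return strings.Join(parts, outDelim)
}

func main() {
	// cut -b1-2,3-4 --output-delimiter=:
	fmt.Println(cutBytes("abcd", []byteRange{{1, 2}, {3, 4}}, ":")) // ab:cd
	// cut -c1-3,2-4,6 --output-delimiter=:
	fmt.Println(cutBytes("abcdefg", []byteRange{{1, 3}, {2, 4}, {6, 6}}, ":")) // abcd:f
	// cut --output-delimiter=: -b2-,3,4-4,5
	fmt.Println(cutBytes("123456", []byteRange{{2, 0}, {3, 3}, {4, 4}, {5, 5}}, ":")) // 23456
}
```

The sketch handles byte mode only and assumes every `lo` is at least 1; rejecting position zero and decreasing ranges is left to argument parsing, as the `errors/` scenarios exercise.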