Implement uniq builtin command by AlexandreYang · Pull Request #20 · DataDog/rshell

AlexandreYang · 2026-03-10T10:30:39Z

What does this PR do?

Implements the POSIX uniq command as a builtin in the safe shell interpreter. The command filters adjacent matching lines from input, supporting all common GNU coreutils flags while maintaining the shell's safety guarantees.

Supported flags: -c/--count, -d/--repeated, -u/--unique, -i/--ignore-case, -f/--skip-fields, -s/--skip-chars, -w/--check-chars, -z/--zero-terminated, -D/--all-repeated[=METHOD], --group[=METHOD], -h/--help.

Intentionally not supported: Output file (2nd positional arg) — writes to filesystem, violating RULES.md.

Motivation

Adds a commonly-used text filtering utility to the shell's builtin command set, enabling deduplication of sorted data in pipelines.

Testing

- 90+ Go unit tests covering all flags, edge cases, error paths, and GNU compatibility
- 25 YAML scenario tests for shell-level integration
- Dedicated pentest test file covering integer overflow, long lines, path traversal, context cancellation, and binary content
- Import allowlist test passes (new symbols added with safety justifications)
- All tests pass: `go test ./interp/builtins/uniq/... ./tests/...`

Checklist

- Tests added/updated
- Documentation updated (if applicable)

PR by Bits

Co-authored-by: AlexandreYang <49917914+AlexandreYang@users.noreply.github.com>

datadog-prod-us1-3 · 2026-03-10T10:30:40Z

Bits Dev status: ✅ Done

CI Auto-fix: Disabled

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6f830c1dbe

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

AlexandreYang · 2026-03-10T11:24:30Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 863ddb2b61

…format

- parseNonNegativeInt now rejects negative overflows (e.g. -999999999999999999999) instead of clamping them to MaxCount
- Error messages for -f/-s/-w now match GNU uniq format: "uniq: VALUE: message"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

AlexandreYang · 2026-03-10T12:10:38Z

@codex review

AlexandreYang · 2026-03-10T12:10:44Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cd4917f219

matt-dz

Security Audit Report

Overall Risk Assessment: LOW
Findings: 0 Critical, 0 High, 2 Medium, 2 Low, 3 Informational

Solid implementation — no sandbox escape vectors, no write capabilities, no paths to arbitrary file access. The security posture is excellent for a restricted shell builtin.

| # | Severity | Title |
|---|---|---|
| S1 | Medium | `int64`→`int` truncation in `compareKey` is fragile |
| S2 | Medium | `lineCount` has no overflow guard (theoretical) |
| S3 | Low | `skipFieldsN` has O(n) empty iterations with large `-f` |
| S4 | Low | Prefix matching in method parsers is order-dependent |
| S5 | Info | Scanner error may expose Go internals |
| S6 | Info+ | Output file correctly omitted — sandbox preserved |
| S7 | Info+ | Import allowlist additions all safe |

Additional Notes

Documentation: README.md and SHELL_FEATURES.md should be updated per AGENTS.md requirements.

Test Gaps:

- No test for scanner read error propagation (e.g., failing reader mid-stream)
- No test for write error propagation (failing writer → "uniq: write error")

… optimization, docs

- Fix -i/--ignore-case to use ASCII-only folding (A-Z → a-z) matching GNU uniq behavior in C/POSIX locale. Non-ASCII characters like Ä/ä are no longer incorrectly collapsed. (P1 - chatgpt-codex-connector)
- Use int64 arithmetic throughout compareKey() to avoid int64→int truncation on 32-bit platforms. (S1 - matt-dz)
- Add bounds check to skipFieldsN loop condition to eliminate O(n) empty iterations with large -f values. (S3 - matt-dz)
- Document uniq builtin in SHELL_FEATURES.md and README.md. (P2 - chatgpt-codex-connector)
- Remove strings.ToLower from import allowlist (no longer used).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

matt-dz

Security Audit — `uniq` Builtin

Overall Risk: LOW — Recommend merge.

The implementation is well-crafted, follows established safety patterns, and has thorough test coverage including dedicated penetration tests.

| Severity | Finding | Location |
|---|---|---|
| MEDIUM | Theoretical `lineCount` overflow on unbounded streams | `uniq.go:304` |
| LOW | Prefix matching ambiguity risk in option parsers | `uniq.go:498-526` |
| INFO | Scanner error not wrapped with `PortableErr` | `uniq.go:385-388` |

Checklist — All PASS

✅ Sandbox bypass — no direct os.Open/os.ReadFile/os.Stat calls
✅ Filesystem access — exclusively via callCtx.OpenFile(ctx, file, os.O_RDONLY, 0)
✅ No write path — output file (2nd positional arg) intentionally rejected
✅ Memory bounded — scanner capped at 1 MiB; only prev+current line in memory
✅ Integer overflow — parseNonNegativeInt handles all edge cases, clamps to MaxCount
✅ Context cancellation — ctx.Err() checked every loop iteration
✅ Import allowlist — all 7 new symbols are safe types/pure functions
✅ Custom SplitFunc — simple byte-delimiter scan, no exploitation vector
✅ Error sanitization — file-open errors use callCtx.PortableErr()

…tableErr

- Add saturating increment for lineCount to prevent theoretical int64 overflow (S2 comment from matt-dz)
- Add comments documenting deliberate first-match-wins prefix ordering in parseAllRepeatedMethod and parseGroupMethod (S3 comment)
- Wrap scanner error with callCtx.PortableErr for consistency with other error paths (S4 comment)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Move skipFieldsN doc comment to its own function (was incorrectly attached to asciiToLower)
- Optimize asciiToLower to avoid allocation when input has no uppercase ASCII letters

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
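One way to get ASCII-only folding with no allocation on the common path is sketched below. The name `asciiToLower` comes from the commit message, but the body is an assumption about how such a helper could be written, not a copy of the PR's code.

```go
package main

import "fmt"

// asciiToLower lowercases only the bytes 'A'..'Z'. Bytes of multi-byte
// UTF-8 sequences (such as those encoding Ä) are all >= 0x80, so they
// are never touched. If s contains no uppercase ASCII letter, s itself
// is returned and nothing is allocated.
func asciiToLower(s string) string {
	for i := 0; i < len(s); i++ {
		if c := s[i]; c >= 'A' && c <= 'Z' {
			// First uppercase byte found: copy once, then fold the rest.
			b := []byte(s)
			for j := i; j < len(b); j++ {
				if b[j] >= 'A' && b[j] <= 'Z' {
					b[j] += 'a' - 'A'
				}
			}
			return string(b)
		}
	}
	return s
}

func main() {
	fmt.Println(asciiToLower("HeLLo"))
	fmt.Println(asciiToLower("Ärger") == "Ärger")
}
```

With this folding, `Ä` and `ä` remain distinct keys under `-i`, which is the behaviour the earlier commit attributes to GNU uniq in the C/POSIX locale.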

matt-dz

Security Review Summary

| # | Priority | File | Finding |
|---|---|---|---|
| - | - | - | No security issues found |

Scope: Reviewed 6 changed source files (implementation, tests, registration, import allowlist) from PR #20.

Verdict: Clean audit — this is a well-hardened implementation with no security concerns.

Audit Details

Sandboxed file access — File I/O uses callCtx.OpenFile() which enforces the shell's path restrictions. Direct os.Open is not used. The import allowlist test (tests/import_allowlist_test.go) enforces symbol-level restrictions at build time.

No filesystem writes — The GNU uniq output-file argument (second positional arg) is intentionally omitted, preventing filesystem write operations. Extra operands are rejected with exit code 1.

Memory safety — Lines are streamed one at a time with a 1 MiB per-line cap (MaxLineBytes). Only the current and previous lines are held in memory. The scan loop checks ctx.Err() on every iteration to honour execution timeouts.

Integer overflow handling — parseNonNegativeInt correctly handles:

- Negative values (rejected)
- Negative overflow (e.g. -999999999999999999999, rejected)
- Positive overflow (clamped to MaxCount)
- lineCount capped at math.MaxInt64
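The parsing rules listed above can be sketched with `strconv.ParseInt`. The value chosen for `maxCount` and the error text are assumptions for this example; the review only states that such a limit exists and how out-of-range input is treated.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strconv"
	"strings"
)

// maxCount is a hypothetical clamp value; the PR's actual MaxCount
// is not shown in this thread.
const maxCount int64 = math.MaxInt32

// parseNonNegativeInt parses a flag argument such as the N in -f N.
// Negative values, including ones too large for int64, are rejected;
// positive values beyond the limit are clamped to maxCount.
func parseNonNegativeInt(s string) (int64, error) {
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		// A positive out-of-range value is clamped, not rejected.
		if errors.Is(err, strconv.ErrRange) && !strings.HasPrefix(s, "-") {
			return maxCount, nil
		}
		return 0, fmt.Errorf("%s: invalid number", s)
	}
	if n < 0 {
		return 0, fmt.Errorf("%s: invalid number", s)
	}
	if n > maxCount {
		return maxCount, nil
	}
	return n, nil
}

func main() {
	for _, arg := range []string{"5", "-1", "-999999999999999999999", "999999999999999999999"} {
		n, err := parseNonNegativeInt(arg)
		fmt.Println(arg, n, err)
	}
}
```

Clamping rather than rejecting large positive values is harmless here because skipping more fields or characters than a line contains simply yields an empty comparison key.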

Slice bounds — skipChars is clamped to len(s) before slicing (line 435–436). checkChars is bounds-checked against string length (line 440). skipFieldsN iteration is bounded by both n and len(s).

Import allowlist compliance — Only uses pre-approved stdlib symbols: bufio, context, io, math, os.O_RDONLY, strconv, strings. No reflect, unsafe, os/exec, or network packages.

Pentest coverage — The dedicated uniq_pentest_test.go covers integer edge cases, overflow, long lines at/beyond the cap, path traversal outside allowed paths, flag injection, context cancellation, and binary/null-byte content. Excellent defensive test coverage.

Reviewed by security-auditor

AlexandreYang · 2026-03-10T19:08:06Z

/merge

gh-worker-devflow-routing-ef8351 · 2026-03-10T19:08:10Z

2026-03-10 19:08:09 UTC ℹ️ Start processing command /merge

2026-03-10 19:08:14 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in main is approximately 45s (p90).

2026-03-10 19:08:51 UTC ℹ️ MergeQueue: This merge request was merged

Implement uniq builtin command

6f830c1

Co-authored-by: AlexandreYang <49917914+AlexandreYang@users.noreply.github.com>

datadog-datadog-prod-us1 Bot added the Bits AI label Mar 10, 2026

AlexandreYang marked this pull request as ready for review March 10, 2026 10:32

AlexandreYang requested review from matt-dz and thieman as code owners March 10, 2026 10:32

chatgpt-codex-connector Bot reviewed Mar 10, 2026

Comment thread interp/builtins/uniq/uniq.go Outdated

Comment thread interp/builtins/uniq/uniq.go Outdated

Comment thread interp/builtins/uniq/uniq.go Outdated

Comment thread interp/builtins/uniq/uniq.go Outdated

AlexandreYang and others added 6 commits March 10, 2026 11:50

Fix uniq error messages to match bash by adding Try --help hint

cd62e37

GNU coreutils uniq appends "Try 'uniq --help' for more information." after mutually exclusive flag errors. Match that behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

add .claude/skills/fix-tests/SKILL.md

57a0599

Fix uniq: reject empty method args, honor --unique with -D, separate …

14e76f2

…read/write errors Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Revert "add .claude/skills/fix-tests/SKILL.md"

6e47773

This reverts commit 57a0599.

Merge branch 'main' into dd/yEwX4dU9gJEU

86baa3a

Merge branch 'main' into dd/yEwX4dU9gJEU

863ddb2

chatgpt-codex-connector Bot reviewed Mar 10, 2026

Comment thread interp/builtins/uniq/uniq.go

chatgpt-codex-connector Bot reviewed Mar 10, 2026

Comment thread interp/builtins/uniq/uniq.go Outdated

Comment thread interp/register_builtins.go

matt-dz reviewed Mar 10, 2026

AlexandreYang and others added 8 commits March 10, 2026 16:54

Add ci-code-review skill for diagnosing and fixing CI failures on PRs

7ab5d5d

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

remove resources

ee24297

Re-implement cat builtin with full flag support

d281b1d

Co-authored-by: AlexandreYang <49917914+AlexandreYang@users.noreply.github.com>

Fix TestCatPentestDevNull on Windows: assert sandbox blocks reserved …

af11309

…device names Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

.claude/skills/ci-code-review/SKILL.md -> .claude/skills/fix-ci-tests…

7d615bd

…/SKILL.md

Add review-comments skill for addressing PR review comments

c64efd0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add critical directive: prioritise shell fixes over test changes for …

0191007

…bash compatibility Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

AlexandreYang requested a review from matt-dz March 10, 2026 16:13

Merge branch 'main' into dd/yEwX4dU9gJEU

40b551c

matt-dz reviewed Mar 10, 2026

Comment thread interp/builtins/uniq/uniq.go

Comment thread interp/builtins/uniq/uniq.go

Comment thread interp/builtins/uniq/uniq.go

AlexandreYang and others added 6 commits March 10, 2026 18:41

Merge branch 'main' into dd/yEwX4dU9gJEU

4401a17

Resolve conflicts in register_builtins.go and import_allowlist_test.go by keeping both uniq and wc additions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix uniq builtin to use MakeFlags API

58952ed

The builtins.Command struct was updated on main to use MakeFlags instead of Run. Refactor the uniq command to register flags via the new factory pattern, matching wc and other builtins. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'main' into dd/yEwX4dU9gJEU

8f3c46a

Resolve conflicts in README.md and SHELL_FEATURES.md by combining both sets of changes (new builtins from main + uniq from this branch). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove unused io.Copy from import allowlist

bc2229e

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

AlexandreYang requested a review from matt-dz March 10, 2026 18:52

AlexandreYang and others added 2 commits March 10, 2026 19:55

Merge remote-tracking branch 'origin/main' into dd/yEwX4dU9gJEU

d4303ac

# Conflicts:
#	tests/import_allowlist_test.go

Remove unused io.Copy from import allowlist

e103a79

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

matt-dz approved these changes Mar 10, 2026

thieman approved these changes Mar 10, 2026

gh-worker-dd-devflow-36fce6 Bot added the mergequeue-status: queued and mergequeue-status: in_progress labels and removed the mergequeue-status: queued label Mar 10, 2026

gh-worker-dd-mergequeue-cf854d Bot merged commit 93e6f7c into main Mar 10, 2026
9 checks passed

gh-worker-dd-devflow-36fce6 Bot removed the mergequeue-status: in_progress label Mar 10, 2026

gh-worker-dd-mergequeue-cf854d Bot deleted the dd/yEwX4dU9gJEU branch March 10, 2026 19:08

gh-worker-dd-devflow-36fce6 Bot added the mergequeue-status: done label Mar 10, 2026

AlexandreYang mentioned this pull request Mar 12, 2026

Implement printf builtin command #57

Merged

4 tasks
