feat(df): add df builtin for disk space usage reporting#205
feat(df): add df builtin for disk space usage reporting#205
Conversation
Adds df as a sandboxed builtin (Linux + macOS; Windows returns "not supported"). Mount enumeration goes through a new internal package, builtins/internal/diskstats, which reads /proc/self/mountinfo on Linux (documented sandbox-bypass mirroring ss/ip route — the path is hardcoded and never derived from user input) and calls getfsstat(2) on macOS. Supported flags: -h, -H, -k, -P, -T, -i, -a, -l, -t TYPE (repeatable), -x TYPE (repeatable), --total, --no-sync, --help. The dangerous --sync flag (which would invoke sync(2) and mutate kernel buffer state) is unregistered and rejected by pflag as unknown. -B, --output, and positional FILE operands are deferred to a future version. Memory bounds: mount table capped at 100k entries, mountinfo line length capped at 1 MiB, scan total capped at 1M lines. Integer arithmetic uses paired right-shifts in percentUsed and saturating addition in totals to avoid overflow on extreme magnitudes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6b2ec9136f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
CI fixes: - Fuzz: dfRunFuzz no longer routes through testutil.RunScriptCtx, which fatals on shell-parse errors. The fuzzer routinely mutates inputs into malformed shell syntax (unclosed quotes, etc.); we now treat parse failures as expected and skip the iteration. - Windows: pentest tests that exercised df's actual code path (TestDfPentestVeryLongTypeName, TestDfPentestManyTypeFilters, TestDfPentestTypeFilterEdgeValues, TestDfPentestNonUTF8FlagValue, TestDfPentestUnicodeNFD, TestDfPentestQuotedValues) now skip via requireSupported — df returns "not supported" on Windows, which made the asserted code==0 invariant unreachable. Codex feedback: - P1: Remove "overlay" from the Linux pseudo-FS table. Container hosts use overlay as the default root filesystem; classifying it as pseudo hid the real root from the default listing. - P2: An explicit -t TYPE filter now overrides the default pseudo-FS suppression. `df -t tmpfs` lists tmpfs mounts without requiring -a, matching GNU df. - P2: humanBytes rounds up via math.Ceil instead of fmt.Sprintf's round-to-nearest, matching GNU df's "never under-report" rule. Example: 1,576,960 bytes is now "1.6M" (was "1.5M"). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Windows test failures: TestDfPentestNonUTF8FlagValue, TestDfPentestUnicodeNFD,
and TestDfPentestQuotedValues were missing requireSupported(t) so they ran on
Windows where df returns "not supported" (code 1), making the asserted code==0
unreachable. Add the skip alongside the seven other Windows-skipped tests.
Fuzz contract: the previous version asserted exit codes in {0, 1, 127} only,
which broke on legitimate runner behaviour like "df 0&" → code 2 (background
job). The fuzz target's real contract is "no panic and no hang inside df";
both are enforced by Go's testing framework and the helper's 5-second timeout
respectively. Drop the exit-code assertion entirely. Also stop fataling on
non-ExitStatus runner errors (glob expansion failures, "internal error" on
adversarial inputs) — those are runner behaviour, not df defects. Verified by
running the fuzzer locally for 20s (670k execs) without failure.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
13 scenario tests under tests/scenarios/cmd/df/ exercised df's actual code path and asserted exit code 0 with output rows. On Windows, df returns "not supported" with exit 1. The scenario framework has no platform-skip mechanism (per AGENTS.md note about ip route), so happy- path scenarios cannot be added for platform-restricted commands. The deleted scenarios were redundant with builtins/df/df_unix_test.go, which uses //go:build unix to exercise the same flag wiring with structural assertions on the live mount table — a richer check than the stdout_contains substring match the YAML scenarios provided. Retained 6 scenarios that are platform-agnostic (they short-circuit before diskstats.List): --help, extra operand, unknown flag, --sync rejection, -B/--block-size rejection, --output rejection. Deleted: basic/default_succeeds.yaml, flags/all.yaml, exclude_type.yaml, human_readable.yaml, inodes.yaml, k_is_default.yaml, local.yaml, no_sync.yaml, posix_format.yaml, print_type.yaml, si.yaml, total.yaml, type_filter.yaml. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b5392e09b8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Four P2 items, all verified against gdf 9.10 locally:
1. tmpfs / devtmpfs are no longer in pseudoTypes. They report real
storage (/dev/shm, /run) and GNU lists nonzero tmpfs mounts in the
default output. Default df under-reported them previously.
2. df -t TYPE / -x TYPE now exit 1 with "no file systems processed" on
stderr when filtering leaves zero rows. GNU df returns 1 in this
case and scripts test the exit status to detect missing filesystem
types — silent exit-0 with empty body was a regression.
3. df -ih / df -iH now scales inode counts via humanBytes (e.g. 4.0G,
381K). Previously inode mode bypassed human formatting and always
printed raw integers, which broke a common usage even though both
-i and -h are documented as supported.
4. df -P with -h or -H now uses the "Size" column header instead of
"1024-blocks". GNU's -P -h emits "Size" — keeping the fixed-block
label under human-suffixed values would mislead parsers about units.
Updated tests:
- TestDfTypeFilter_NoMatches: now asserts exit 1 + stderr message.
- TestFormatCount: covers the -ih and -iH cases.
- TestBuildHeader: pins -P -h and -P -H to "Size".
- TestDfPentest{VeryLongTypeName, ManyTypeFilters, TypeFilterEdgeValues,
NonUTF8FlagValue, UnicodeNFD, QuotedValues}: relaxed to accept exit 0
or 1 (no-match values are common in pentest inputs).
- TestDfPentestTypeIncludeAndExcludeSameType: now asserts the new
exit-1 + "no file systems processed" contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-on test fixes after the round-2 GNU-compat changes (commit 7b62dfc): 1. TestParseMountInfo_HappyPath: asserted devtmpfs.Pseudo == true, but devtmpfs was just removed from pseudoTypes. Flip the assertion to match the new (correct) classification, and add an analogous check for the /run tmpfs entry to lock in tmpfs.Pseudo == false. 2. TestDfPentestAllFlagsAtOnce: asserted exit 0, but on Linux the -t apfs filter matches no rows so df now correctly emits "no file systems processed" and exits 1 (the new GNU-compat path). Relax to accept either 0 or 1; the pentest's contract is "stacking every flag does not crash", not "succeeds with this specific argv on every host". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1a7d813ef6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…ception Three Codex round-3 items: 1. P1 — pre-stat filter (real bug). statfs(2) on a stale NFS / CIFS mount can hang indefinitely and is not interrupted by ctx cancellation. Previously diskstats.List statfs'd every mount in /proc/self/mountinfo before df.go could apply -l / -x nfs filters, so `df -l` could hang on a dead remote even though the user had explicitly asked to skip remote mounts. Fix: diskstats.List now takes a FilterFunc parameter. df.go's makePreStatFilter encodes -t/-x/-a/-l in a closure that runs between mountinfo parsing and the statfs syscall on Linux. Darwin already uses MNT_NOWAIT so the filter is cosmetic there. 2. P1 — AllowedPaths and statfs (documentation). statfs returns metadata only (block/inode counts, fs type, block size); no file content is read. The mount-point paths are kernel-controlled, never user-derived. This is the same exception class as ss reading all sockets and ip route reading the full routing table — gating it on AllowedPaths would produce a misleading partial listing. Documented under "Security Design Decisions" in AGENTS.md so the trade-off is explicit. 3. P2 — dedup bind-mounts. GNU df hides mounts that share a Source with an already-emitted mount unless -a is given. On container hosts with overlay bind-mounts of /etc/hosts, /etc/hostname, /etc/resolv.conf, default df was printing three duplicate rows and --total was double-counting them. Fix: filterMounts now keeps a seen-Source set; only the first mount for each Source is emitted unless -a is set. Empty Source values (rare; some pseudo filesystems) are not collapsed onto each other. Tests: - New TestPreStatFilter_* coverage replaces TestFilterMounts_* for the type/pseudo/local logic that moved into makePreStatFilter. - New TestFilterMounts_DedupBySourceWithoutAll, _AllPreservesDuplicates, _EmptySourceNotDeduped lock in the dedup behaviour. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fc77c89e8a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
… promotion
Three Codex round-4 items, all real:
1. P1 — FUSE remote subtypes (hang protection for sshfs etc.). Linux
mountinfo reports FUSE mounts as fuse.<subtype>, e.g. fuse.sshfs
or fuse.smbnetfs. The bare prefix list ("sshfs", "smb", …) did not
match these strings, so a stale fuse.sshfs mount was still statfs'd
and could hang despite -l / -x. Added explicit fuse.<remote-backend>
entries: fuse.sshfs, fuse.smb, fuse.cifs, fuse.davfs, fuse.glusterfs,
fuse.cephfs, fuse.nfs, fuse.s3, fuse.rclone. TestIsRemoteType extended
with positive cases for each subtype and negative cases for the
local FUSE backends (gvfsd-fuse, portal, archivemount).
2. P1 — Document df bypass in README.md. The "Security Model" note
already covered ss and ip route but not df; operators reading the
README would miss that AllowedPaths cannot hide mount metadata
from df. Updated the bullet to list df alongside ss/ip route and
to spell out that Statfs(2) returns metadata only.
3. P2 — humanBytes promotes after rounding. humanBytes(1048575, 1024)
was returning "1024K" instead of "1.0M" because the suffix was
chosen before the ceiling. Refactored: one rounding pass with
granularity decided by pre-rounded magnitude (one decimal if
scaled < 10, integer otherwise), then a single promotion step when
the rounded value reaches base. df -h now matches gdf -h
byte-for-byte on my host (927G/512G/415G — previously 926G off-by-one).
Restored the 1<<20-1 → 1.0M test case I had dropped, plus 1<<30-1 →
1.0G, 1<<40-1 → 1.0T, 10485759 → 10M.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9d94c7f3d9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if _, dup := seen[m.Source]; dup { | ||
| continue |
There was a problem hiding this comment.
Deduplicate by device and shortest mount point
GNU df documents that, without -a, bind-mount duplicates are elided by device and the shortest mount point is kept. This source-key check keeps the first same-source mount instead; on hosts like this container, kataShared appears as /etc/resolv.conf, /etc/hostname, and /etc/hosts, so rshell reports /etc/resolv.conf while GNU reports /etc/hosts, and unrelated filesystems that share a source string can be dropped. Use device identity and shortest target rather than Source alone.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Agreed. Added a DevID field to Mount (parsed from /proc/self/mountinfo field 2 on Linux, formatted from Statfs_t.Fsid on Darwin) and rewrote filterMounts as a two-pass dedup: first pass picks the index of the shortest-mountpoint entry per DevID, second pass emits in original order. Two unrelated overlay mounts with distinct DevIDs are no longer collapsed (the kataShared bug). Tests: TestFilterMounts_DedupByDevicePicksShortestMountpoint (kataShared scenario verbatim) and TestFilterMounts_DistinctDeviceSameSourceNotDeduped.
| if local && !m.Local { | ||
| return false |
There was a problem hiding this comment.
Preserve pseudo filesystems for df -al
When both -a and -l are supplied, -a explicitly asks to include dummy/pseudo filesystems while -l only limits out remote filesystems; GNU df -al includes local pseudo mounts such as proc, sysfs, and cgroup2. Because Local is false for every pseudo mount, this filter drops them anyway whenever -l is set, so df -al silently omits the entries -a was meant to restore. Track remote-ness separately or let pseudo mounts pass when all is set.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Agreed. Local was conflating "not pseudo" with "not remote". Per GNU df, pseudo and remote are independent classifications — pseudo mounts (proc, sysfs, cgroup2) are local. Changed the Linux and Darwin classifications to Local := !isRemoteType(...) so pseudo mounts pass the -l filter. Verified df -al now lists more entries than default (pseudo mounts re-enabled by -a still survive -l). Added TestPreStatFilter_AllPlusLocalKeepsPseudo to lock this in.
| // Allow comma-separated lists, matching GNU df. | ||
| for p := range strings.SplitSeq(v, ",") { | ||
| if p == "" { |
There was a problem hiding this comment.
Keep comma-separated fstype arguments literal
GNU df specifies that multiple filesystem types are selected by giving multiple -t options, not by comma-splitting one argument. This split makes df -t overlay,tmpfs match tmpfs mounts in rshell, while GNU treats overlay,tmpfs as one literal type and exits with no file systems processed; scripts that rely on that exit status will get the wrong result.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Agreed. Verified empirically: gdf -t apfs,ext4 exits 1 with "no file systems processed" — GNU treats apfs,ext4 as a single literal type. Removed comma-splitting from stringSet; -t / -x values are now stored as keyed by the literal argv string. Updated TestStringSet to pin the new behaviour and removed strings.SplitSeq from the symbol allowlist (no longer used).
…split -t Three Codex round-5 items, all real: 1. P2 — pseudo + -l. Mount.Local was defined as "not pseudo and not remote", so pseudo mounts (proc, sysfs, cgroup2) had Local=false and were silently dropped by `df -al`. GNU df treats pseudo and remote as independent classifications: -l drops only remote filesystems, and pseudo passes when -a is set. Redefined Local := !isRemoteType(...) on both Linux and Darwin so pseudo mounts are local (they live in kernel memory, not on a remote server). New TestPreStatFilter_AllPlusLocalKeepsPseudo locks this in. Verified locally: `df -al` now lists more entries than `df`, pseudo mounts re-enabled by -a survive -l. 2. P2 — dedup by device + shortest mount point. The previous Source-string dedup was brittle: two unrelated overlay mounts sharing a literal source name were collapsed (the kataShared bug Codex flagged), and the chosen representative depended on input order rather than mount-point length. Added a DevID field to Mount (parsed from /proc/self/mountinfo field index 2 on Linux, formatted from Statfs_t.Fsid on Darwin) and rewrote filterMounts as a two-pass dedup: first pass picks the index of the shortest-mountpoint entry per DevID, second pass emits in original order. Distinct DevIDs are preserved even when Source matches. New tests: TestFilterMounts_DedupByDevicePicksShortestMountpoint (kataShared scenario verbatim) and TestFilterMounts_DistinctDeviceSameSourceNotDeduped. 3. P3 — drop comma-split in -t / -x. Verified empirically: `gdf -t apfs,ext4` exits 1 with "no file systems processed" — GNU treats the entire string as a single literal type. The previous code split on commas and matched either side, which broke scripts that used the no-match exit code as a presence test. Updated stringSet to store the verbatim argv values; removed strings.SplitSeq from the symbol allowlist. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 05b3504ff3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| m.Total = uint64(st.Blocks) * bsize | ||
| m.Free = uint64(st.Bavail) * bsize | ||
| // Used is computed from f_blocks - f_bfree (root-reserved | ||
| // blocks are counted as used), which differs from Total - Free. | ||
| m.Used = subSat(uint64(st.Blocks), uint64(st.Bfree)) * bsize |
There was a problem hiding this comment.
Saturate statfs block multiplications
When a FUSE or otherwise buggy filesystem reports Blocks, Bavail, or Bfree above math.MaxUint64 / bsize, these products wrap before the later saturatingAdd total logic sees them, so a single rogue mount can print tiny/zero sizes and corrupt --total. This also violates docs/RULES.md's requirement that commands check integer overflow in arithmetic; use a checked/saturating multiply for Total, Free, and Used before formatting or accumulating.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Agreed. Added a mulSat(a, b uint64) helper in diskstats_unix.go that clamps to MaxUint64 when a*b would wrap. The Linux and Darwin backends now use mulSat for Total, Free, and Used instead of plain *. A buggy/malicious FUSE filesystem reporting block counts above MaxUint64/bsize can no longer wrap a single mount to a tiny size or corrupt --total. Locked in by TestMulSat covering boundary, just-over-boundary, and the realistic FUSE-rogue (maxU * 4096) case.
| if *f.human { | ||
| return unitsHuman1024 | ||
| } | ||
| if *f.si { | ||
| return unitsHuman1000 |
There was a problem hiding this comment.
When users pass both unit flags, e.g. df -hH, pflag records both booleans but this priority always returns IEC/1024 units. GNU df documents -h as powers of 1024 and -H/--si as powers of 1000 (https://www.gnu.org/software/coreutils/df), and the later option determines the requested mode (df -hH / should be SI while df -Hh / should be IEC), so scripts that append -H to override -h get the wrong sizes.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Agreed — the previous code unconditionally preferred -h over -H. pflag.FlagSet.Visit walks set flags in lexicographical order, not argv order, so it could not be used to honor input order. Refactored to a custom unitFlag (a pflag.Value) where -h and -H share a single *unitMode target and each Set call overwrites it — the LAST one wins by construction. Confirmed locally: df -hH now prints SI (995G on my host) while df -Hh prints IEC (927G). Locked in by TestUnitFlag_LastFlagWins covering 10 argv interleavings (separate flags, combined short, long-form, with non-unit flags between).
Two Codex round-6 items, both real: 1. P1 — saturating block multiply. unix.Statfs_t.Blocks * bsize could wrap a uint64 if a buggy/malicious FUSE filesystem reports counts above MaxUint64/bsize, producing tiny/zero Total/Used/Free for that mount and corrupting the --total accumulation. Added a mulSat(a, b) helper to diskstats_unix.go that clamps to MaxUint64 on overflow. Linux and Darwin backends now use mulSat for Total, Free, and Used. Locked in by TestMulSat covering boundary, just-over-boundary, and the realistic FUSE-rogue (maxU * 4096) case. 2. P2 — last-flag-wins for -h / -H. The previous code unconditionally preferred -h, so `df -hH` (intended SI override) and shell aliases that append -H to a default got the wrong size column. pflag.Visit walks set flags in lexicographical order, not argv order, so it cannot be used to honor input order. Refactored to a custom unitFlag (a pflag.Value) where -h and -H share a single *unitMode target and each Set call overwrites it — the LAST one wins by construction. registerUnitFlag wraps fs.VarPF + NoOptDefVal="true" so the flags accept no argument, matching pflag's bool convention. Confirmed locally: `df -hH` prints SI (995G), `df -Hh` prints IEC (927G). New TestUnitFlag_LastFlagWins covers 10 argv interleavings including combined short flags (-hH, -Hh). Side effect of #2: dropped the unused `human`/`si` *bool fields from the flags struct and the resolveUnitMode helper; mode is now read directly from a *unitMode field. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cc63258ac6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // -k is registered separately because it has no long form. It is a | ||
| // no-op in this v1 implementation — 1024-byte blocks are already | ||
| // the default — but POSIX scripts pass it explicitly. | ||
| fs.BoolP("kibibytes", "k", false, "use 1024-byte blocks (POSIX default)") |
There was a problem hiding this comment.
Let -k override earlier human-size flags
When -k appears after a human-readable size flag, this no-op registration leaves mode set to unitsHuman1024/unitsHuman1000, so df -h -k, df -H -k, and compact forms like df -hk still print Size with human units. GNU documents -k as “equivalent to --block-size=1K” (and local df -h -k prints a 1K-blocks header), so scripts that rely on option-order compatibility get the wrong columns and values.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Agreed — verified empirically: gdf -h -k prints 1K-blocks and gdf -k -h prints Size. Promoted -k into the same last-flag-wins group as -h / -H by registering it via registerUnitFlag with unitsK value. Now all three share the *unitMode target and the LAST one in argv wins. Locked in by extending TestUnitFlag_LastFlagWins with six new cases (-h -k, -H -k, -k -h, -k -H, -hk, -kh).
| if _, ok := excludeSet[m.FSType]; ok { | ||
| return false | ||
| } | ||
| if len(includeSet) > 0 { | ||
| if _, ok := includeSet[m.FSType]; !ok { | ||
| return false |
There was a problem hiding this comment.
Reject overlapping -t and -x filters
When the same filesystem type is passed to both -t and -x, this branch makes exclusion silently win; for example df -t tmpfs -t ext4 -x tmpfs succeeds here and lists ext4, while GNU df exits 1 with “file system type ‘tmpfs’ both selected and excluded”. That hides an invalid invocation and diverges from the bash-compatible behavior required for these builtins whenever configs/scripts accidentally specify contradictory type filters.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Agreed — verified gdf -t apfs -x apfs exits 1 with "file system type 'apfs' both selected and excluded". Added an overlappingType helper that scans the include / exclude argv for any conflicting type and returns it (or ""). The handler now checks this before any other work and emits the same GNU-format error, so contradictory configs fail loudly instead of silently picking exclude. TestOverlappingType covers nil/empty/disjoint/single/multiple-overlap inputs. Updated TestDfPentestTypeIncludeAndExcludeSameType to assert the new error. The lower-level TestPreStatFilter_TypeExcludeWinsOverIncludeOnPseudo is kept and re-doc'd as testing the filter layer's exclude-precedence in isolation, since the top-level handler now blocks that input from ever reaching the filter.
Two Codex round-7 items, both verified empirically against gdf 9.10: 1. P2 — -k participates in the unit-mode last-flag-wins group. Before, -k was a no-op bool flag that ran alongside -h / -H without ever updating the shared mode, so `df -h -k` left mode = unitsHuman1024 even though gdf -h -k prints "1K-blocks". Promoted -k into the same registerUnitFlag group with unitsK value: all three flags now share the *unitMode target and the LAST argv entry wins. Locked in: TestUnitFlag_LastFlagWins gains six new cases (-h -k, -H -k, -k -h, -k -H, -hk, -kh). Smoke verified: df -h -k now prints "1K-blocks", df -k -h prints "Size". 2. P2 — reject overlapping -t / -x. gdf -t apfs -x apfs exits 1 with "file system type 'apfs' both selected and excluded". The previous code silently let exclusion win. Added an overlappingType helper and an early-out check in the handler that emits the GNU-format error before any mount listing runs. Updated TestDfPentestTypeIncludeAndExcludeSameType to assert the new error; kept TestPreStatFilter_TypeExcludeWinsOverIncludeOnPseudo to lock in the filter's lower-level exclude-precedence behaviour in isolation. Added TestOverlappingType for the helper itself. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 755c2fe837
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if withType { | ||
| cols = append(cols, "Type") | ||
| } | ||
| cols = append(cols, col1, "Used", "Available", capacity, last) |
There was a problem hiding this comment.
Use GNU's Avail header for human-readable output
When users run df -h or df -H, GNU coreutils labels the available-space column Avail (for example, GNU df 9.4 prints Filesystem Size Used Avail Use% Mounted on), but this unconditional Available makes rshell diverge in one of the most common df modes. Any bash/coreutils comparison scenario for human-readable df output will fail; keep Available for fixed-block output and switch to Avail when the selected unit mode is human-readable.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Agreed — verified gdf -h prints Filesystem Size Used Avail Use% Mounted on (no 'e' on Avail). buildHeader now switches Available → Avail when the unit mode is unitsHuman1024 or unitsHuman1000; fixed-block modes (default, -k, -P) keep the full Available. TestBuildHeader covers both human cases and TestGNUCompatHeaderHuman was strengthened to assert Avail is present and Available is not.
| if posix { | ||
| // POSIX output format for inodes still uses "Capacity". | ||
| cols[len(cols)-2] = "Capacity" |
There was a problem hiding this comment.
Keep IUse% for inode output with -P
When -P is combined with inode mode, GNU df keeps the inode percentage header as IUse% (df -iP prints Filesystem Inodes IUsed IFree IUse% Mounted on), but this replacement changes it to Capacity. Scripts or bash-comparison tests that inspect the inode header will see a different column name; only the block POSIX format should use Capacity.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Agreed — verified gdf -iP prints Filesystem Inodes IUsed IFree IUse% Mounted on (the inode percentage stays IUse%, not Capacity). Removed the conditional that rewrote IUse% → Capacity when -P was set in inode mode. Block POSIX (df -P without -i) still uses Capacity as required. Updated TestBuildHeader to assert that -iP keeps IUse% and does NOT contain Capacity.
Two Codex round-8 items, both verified empirically against gdf 9.10: 1. P2 — GNU compresses "Available" to "Avail" in human modes (`gdf -h /` → "Filesystem Size Used Avail Use% Mounted on"). My buildHeader unconditionally emitted "Available", so any bash-comparison scenario for `df -h` / `df -H` would diverge. buildHeader now switches to "Avail" when the unit mode is unitsHuman1024 or unitsHuman1000; fixed-block modes (default, -k, -P) keep the full "Available" string. 2. P2 — `gdf -iP` keeps "IUse%" as the percentage header, not "Capacity". Only the *block* POSIX format substitutes "Capacity" for "Use%". My code unconditionally replaced "IUse%" with "Capacity" when posix was set in inode mode, which diverged from GNU. Removed the conditional; the inode header is now always "IUse%" regardless of -P. Tests: TestBuildHeader expanded to pin both behaviours (Avail in -h / -H, IUse% preserved with -iP, IUse% not replaced by Capacity). Also strengthened TestGNUCompatHeaderHuman to explicitly check that "Avail" is present and "Available" is NOT. Smoke verified locally: df -h → Filesystem ... Size Used Avail Use% Mounted on df -iP → Filesystem ... Inodes IUsed IFree IUse% Mounted on df -P → Filesystem ... 1024-blocks Used Available Capacity Mounted on Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@codex review |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5b3d483519
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| posix := *f.posix | ||
| withType := *f.printType | ||
| inodeMode := *f.inodes |
There was a problem hiding this comment.
Limit POSIX layout to fixed block output
When -P is combined with -T, -i, -h, or -H, GNU df uses the normal aligned layout for these GNU-extended modes (for example, df -Pi and df -PT have padded columns, and df -hP keeps the Use% header). Because posix stays true here, printRows emits single-space rows and buildHeader can rename the percent column to Capacity, so supported flag combinations diverge from bash/coreutils output.
Useful? React with 👍 / 👎.
| // -h, and `df -k -h` prints "Size" for the reverse reason. | ||
| registerUnitFlag(fs, &mode, unitsHuman1024, "human-readable", "h", "print sizes in powers of 1024 (e.g. 1023M)") | ||
| registerUnitFlag(fs, &mode, unitsHuman1000, "si", "H", "print sizes in powers of 1000 (e.g. 1.1G)") | ||
| registerUnitFlag(fs, &mode, unitsK, "kibibytes", "k", "use 1024-byte blocks (POSIX default)") |
There was a problem hiding this comment.
Reject the nonstandard --kibibytes flag
This long name makes df --kibibytes succeed even though the documented/GNU-compatible interface only has short -k (GNU documents -k as equivalent to --block-size=1K, not --kibibytes). Since unsupported flags are supposed to fail, scripts can accidentally rely on rshell-only behavior and help advertises a flag that real df rejects.
Useful? React with 👍 / 👎.
Summary
Adds
dfas a sandboxed builtin so AI-agent scripts can inspect mounted filesystem usage without invoking the hostdfbinary. v1 supports Linux + macOS; Windows returns "not supported" (mirroringuname).builtins/internal/diskstats, that reads/proc/self/mountinfoon Linux and callsgetfsstat(2)on macOS. The/procread is exempt fromAllowedPathsfor the same reasonssandip routeare — the path is hardcoded, never derived from user input.--syncflag (which would invokesync(2)and mutate kernel buffer state) is unregistered and rejected by pflag as unknown. GTFOBins has nodfentry.Flag set
Implemented:
-h,-H,-k,-P,-T,-i,-a,-l,-t TYPE(repeatable),-x TYPE(repeatable),--total,--no-sync,--help.Deferred to v2:
[FILE]...operands,-B/--block-size,--output[=FIELDS].Rejected (unknown to pflag):
--sync,-v,--version.Safety bounds (per
docs/RULES.md)ErrMaxMountsreturned when truncated; both Linux and Darwin honour this for parity)./proc/self/mountinfoline cap of 1 MiB (errLineTooLong).percentUseduses paired right-shifts to avoidused*100overflow at extreme magnitudes.saturatingAddis used for grand-total accumulation so a rogue mount cannot wrap the running totals.formatCountuses floor + remainder bump to avoid wraparound when grand totals saturate toMaxUint64.ctx.Err()is checked at the top of every per-mount loop on both backends.Test plan
go test ./...(full suite passes locally on darwin/arm64).make fmtclean.go vet ./...cross-platform (GOOS=linux,GOOS=windows) clean.MaxMountstruncation, line-too-long, context cancellation.%,--totalrow equals saturated sum of per-mount columns,-tfilter keeps only matching rows,-xfilter removes them.-h,-i,-T,--total).-tflags, weird type values (empty/whitespace/comma/UTF-8/non-UTF-8/Unicode NFD), shell-metacharacter values, attempted config overrides (--proc-path,--mountinfo,--root,--prefix).FuzzParseMountInfo,FuzzUnescapeMountField,FuzzDfFlagCombinatorwith seed corpora drawn from implementation edge cases, CVE-class inputs (NUL bytes, CRLF, invalid UTF-8, ELF/PE/ZIP magic prefixes), and replays of every existing test input. Added to.github/workflows/fuzz.ymlso they run in CI.helplistsdfwith the right description;help dfshows full usage.🤖 Generated with Claude Code