From f34fd54c64db70f1da65391359b1f22800b70436 Mon Sep 17 00:00:00 2001 From: Alexandre Yang Date: Thu, 12 Mar 2026 00:55:26 +0100 Subject: [PATCH 01/20] Implement printf builtin command MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add a safe printf builtin supporting all standard format specifiers (%s, %b, %c, %d, %i, %o, %u, %x, %X, %e, %E, %f, %F, %g, %G, %%), escape sequences, width/precision modifiers (including * from args), and format string reuse for excess arguments. Safety measures: %n rejected (security risk), -v rejected, format reuse bounded to 10,000 iterations with context cancellation checks, width/precision clamped to ±10,000 to prevent memory exhaustion. Includes 80+ Go unit tests, 25 GNU compatibility tests, 41 pentest tests, and 46 YAML scenario tests. Co-Authored-By: Claude Opus 4.6 --- SHELL_FEATURES.md | 1 + interp/builtins/printf/printf.go | 764 ++++++++++++++++++ .../builtins/printf/printf_gnu_compat_test.go | 269 ++++++ interp/builtins/printf/printf_pentest_test.go | 331 ++++++++ interp/builtins/printf/printf_test.go | 748 +++++++++++++++++ interp/register_builtins.go | 2 + tests/allowed_symbols_test.go | 10 + .../cmd/printf/basic/format_only.yaml | 9 + .../cmd/printf/basic/format_reuse.yaml | 11 + .../cmd/printf/basic/missing_arg_number.yaml | 9 + .../cmd/printf/basic/missing_arg_string.yaml | 8 + .../cmd/printf/basic/multiple_args.yaml | 9 + tests/scenarios/cmd/printf/basic/no_args.yaml | 8 + .../cmd/printf/basic/percent_literal.yaml | 9 + .../cmd/printf/basic/simple_string.yaml | 9 + .../cmd/printf/errors/invalid_number.yaml | 9 + .../cmd/printf/errors/no_format.yaml | 8 + .../printf/errors/rejected_n_specifier.yaml | 8 + .../cmd/printf/errors/rejected_v_flag.yaml | 9 + .../cmd/printf/escapes/backslash.yaml | 8 + .../cmd/printf/escapes/bell_and_others.yaml | 8 + .../cmd/printf/escapes/carriage_return.yaml | 8 + tests/scenarios/cmd/printf/escapes/hex.yaml | 9 + 
.../scenarios/cmd/printf/escapes/newline.yaml | 10 + tests/scenarios/cmd/printf/escapes/octal.yaml | 9 + tests/scenarios/cmd/printf/escapes/tab.yaml | 8 + .../cmd/printf/numeric/char_constant.yaml | 9 + .../cmd/printf/numeric/hex_input.yaml | 9 + .../cmd/printf/numeric/negative.yaml | 9 + .../cmd/printf/numeric/octal_input.yaml | 9 + tests/scenarios/cmd/printf/numeric/zero.yaml | 9 + .../shell_features/command_substitution.yaml | 10 + .../printf/shell_features/in_for_loop.yaml | 8 + .../printf/shell_features/in_pipeline.yaml | 9 + .../shell_features/variable_expansion.yaml | 9 + .../cmd/printf/specifiers/b_escape.yaml | 8 + .../printf/specifiers/b_with_backslash_c.yaml | 8 + .../cmd/printf/specifiers/char_c.yaml | 9 + .../cmd/printf/specifiers/decimal_d.yaml | 9 + .../cmd/printf/specifiers/float_f.yaml | 9 + .../cmd/printf/specifiers/hex_lower.yaml | 9 + .../cmd/printf/specifiers/hex_upper.yaml | 9 + .../cmd/printf/specifiers/integer_i.yaml | 9 + .../cmd/printf/specifiers/octal_o.yaml | 9 + .../cmd/printf/specifiers/scientific_e.yaml | 9 + .../cmd/printf/specifiers/shortest_g.yaml | 9 + .../cmd/printf/specifiers/string_s.yaml | 9 + .../cmd/printf/specifiers/unsigned_u.yaml | 9 + .../printf/width_precision/left_align.yaml | 9 + .../width_precision/precision_float.yaml | 9 + .../width_precision/precision_string.yaml | 9 + .../printf/width_precision/right_align.yaml | 8 + .../cmd/printf/width_precision/zero_pad.yaml | 9 + 53 files changed, 2531 insertions(+) create mode 100644 interp/builtins/printf/printf.go create mode 100644 interp/builtins/printf/printf_gnu_compat_test.go create mode 100644 interp/builtins/printf/printf_pentest_test.go create mode 100644 interp/builtins/printf/printf_test.go create mode 100644 tests/scenarios/cmd/printf/basic/format_only.yaml create mode 100644 tests/scenarios/cmd/printf/basic/format_reuse.yaml create mode 100644 tests/scenarios/cmd/printf/basic/missing_arg_number.yaml create mode 100644 
tests/scenarios/cmd/printf/basic/missing_arg_string.yaml create mode 100644 tests/scenarios/cmd/printf/basic/multiple_args.yaml create mode 100644 tests/scenarios/cmd/printf/basic/no_args.yaml create mode 100644 tests/scenarios/cmd/printf/basic/percent_literal.yaml create mode 100644 tests/scenarios/cmd/printf/basic/simple_string.yaml create mode 100644 tests/scenarios/cmd/printf/errors/invalid_number.yaml create mode 100644 tests/scenarios/cmd/printf/errors/no_format.yaml create mode 100644 tests/scenarios/cmd/printf/errors/rejected_n_specifier.yaml create mode 100644 tests/scenarios/cmd/printf/errors/rejected_v_flag.yaml create mode 100644 tests/scenarios/cmd/printf/escapes/backslash.yaml create mode 100644 tests/scenarios/cmd/printf/escapes/bell_and_others.yaml create mode 100644 tests/scenarios/cmd/printf/escapes/carriage_return.yaml create mode 100644 tests/scenarios/cmd/printf/escapes/hex.yaml create mode 100644 tests/scenarios/cmd/printf/escapes/newline.yaml create mode 100644 tests/scenarios/cmd/printf/escapes/octal.yaml create mode 100644 tests/scenarios/cmd/printf/escapes/tab.yaml create mode 100644 tests/scenarios/cmd/printf/numeric/char_constant.yaml create mode 100644 tests/scenarios/cmd/printf/numeric/hex_input.yaml create mode 100644 tests/scenarios/cmd/printf/numeric/negative.yaml create mode 100644 tests/scenarios/cmd/printf/numeric/octal_input.yaml create mode 100644 tests/scenarios/cmd/printf/numeric/zero.yaml create mode 100644 tests/scenarios/cmd/printf/shell_features/command_substitution.yaml create mode 100644 tests/scenarios/cmd/printf/shell_features/in_for_loop.yaml create mode 100644 tests/scenarios/cmd/printf/shell_features/in_pipeline.yaml create mode 100644 tests/scenarios/cmd/printf/shell_features/variable_expansion.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/b_escape.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/b_with_backslash_c.yaml create mode 100644 
tests/scenarios/cmd/printf/specifiers/char_c.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/decimal_d.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/float_f.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/hex_lower.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/hex_upper.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/integer_i.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/octal_o.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/scientific_e.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/shortest_g.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/string_s.yaml create mode 100644 tests/scenarios/cmd/printf/specifiers/unsigned_u.yaml create mode 100644 tests/scenarios/cmd/printf/width_precision/left_align.yaml create mode 100644 tests/scenarios/cmd/printf/width_precision/precision_float.yaml create mode 100644 tests/scenarios/cmd/printf/width_precision/precision_string.yaml create mode 100644 tests/scenarios/cmd/printf/width_precision/right_align.yaml create mode 100644 tests/scenarios/cmd/printf/width_precision/zero_pad.yaml diff --git a/SHELL_FEATURES.md b/SHELL_FEATURES.md index 40746388..3a0fdf37 100644 --- a/SHELL_FEATURES.md +++ b/SHELL_FEATURES.md @@ -15,6 +15,7 @@ Blocked features are rejected before execution with exit code 2. 
- ✅ `grep [-EFGivclLnHhoqsxw] [-e PATTERN] [-m NUM] [-A NUM] [-B NUM] [-C NUM] PATTERN [FILE]...` — print lines that match patterns; uses RE2 regex engine (linear-time, no backtracking) - ✅ `head [-n N|-c N] [-q|-v] [-z] [FILE]...` — output the first part of files (default: first 10 lines) - ✅ `ls [-1aAdFhlpRrSt] [FILE]...` — list directory contents +- ✅ `printf FORMAT [ARGUMENT]...` — format and print data to stdout; supports `%s`, `%b`, `%c`, `%d`, `%i`, `%o`, `%u`, `%x`, `%X`, `%e`, `%E`, `%f`, `%F`, `%g`, `%G`, `%%`; format reuse for excess arguments; `%n` rejected (security risk); `-v` rejected - ✅ `strings [-a] [-n MIN] [-t o|d|x] [-o] [-f] [-s SEP] [FILE]...` — print printable character sequences in files (default min length 4); offsets via `-t`/`-o`; filename prefix via `-f`; custom separator via `-s` - ✅ `tail [-n N|-c N] [-q|-v] [-z] [FILE]...` — output the last part of files (default: last 10 lines); supports `+N` offset mode; `-f`/`--follow` is rejected - ✅ `true` — return exit code 0 diff --git a/interp/builtins/printf/printf.go b/interp/builtins/printf/printf.go new file mode 100644 index 00000000..e391553f --- /dev/null +++ b/interp/builtins/printf/printf.go @@ -0,0 +1,764 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +// Package printf implements the printf builtin command. +// +// printf — format and print data +// +// Usage: printf FORMAT [ARGUMENT]... +// +// Write formatted output to standard output. FORMAT is a string that +// contains literal text and format specifiers (introduced by %). Each +// format specifier consumes the next ARGUMENT and formats it. 
+// +// If there are more ARGUMENTs than format specifiers, the FORMAT string +// is reused from the beginning until all arguments are consumed (bounded +// to 10,000 iterations to prevent runaway loops). +// +// Missing arguments default to "" for string specifiers and 0 for +// numeric specifiers. +// +// Accepted flags: +// +// -h, --help +// Print this usage message to stdout and exit 0. +// +// Rejected flags: +// +// -v varname +// Bash extension to assign output to a variable. Not supported +// in the restricted shell. +// +// Format specifiers: +// +// %s String. +// %b String with backslash escape interpretation (like echo -e). +// \c in %b stops all further output. +// %c First character of the argument. +// %d, %i Signed decimal integer. +// %o Unsigned octal integer. +// %u Unsigned decimal integer. +// %x, %X Unsigned hexadecimal integer (lower/upper). +// %e, %E Scientific notation float. +// %f, %F Decimal float. +// %g, %G Shortest float representation. +// %% Literal percent sign. +// +// Width and precision modifiers are supported (e.g. %10s, %-10s, %.5f, +// %010d). Flag characters: - (left-align), + (sign), ' ' (space), +// 0 (zero-pad), # (alternate form). +// +// Escape sequences in FORMAT string: +// +// \\ backslash +// \a alert (BEL) +// \b backspace +// \f form feed +// \n newline +// \r carriage return +// \t horizontal tab +// \v vertical tab +// \" double quote +// \NNN octal byte value (1-3 digits) +// \0NNN octal byte value (0 + 1-3 digits) +// \xHH hexadecimal byte value (1-2 digits) +// +// Numeric argument extensions: +// +// Arguments for numeric specifiers may be: +// - Decimal integers: 42, -7, +3 +// - Octal: 0755 +// - Hexadecimal: 0xff, 0XFF +// - Character constants: "'A" or '"A' gives the ASCII value of A +// +// Not implemented (rejected): +// +// %n Byte count write (security risk). Produces an error. +// %q Shell-quoting (bash extension, not POSIX). +// %a, %A Hexadecimal float (deferred). 
+// +// Exit codes: +// +// 0 Successful completion. +// 1 Usage error, missing format string, invalid number, or rejected specifier. +// +// Memory safety: +// +// printf does not read files or stdin. All output is generated from +// the format string and arguments. The format reuse loop is bounded +// to maxFormatIterations (10,000) and checks ctx.Err() on each +// iteration to honour the shell's execution timeout. +package printf + +import ( + "context" + "fmt" + "math" + "strconv" + "strings" + + "github.com/DataDog/rshell/interp/builtins" +) + +// Cmd is the printf builtin command descriptor. +// printf uses NoFlags because its arguments (format string and data) can look +// like flags (e.g. printf "%d" -42). Manual pre-parsing handles --help and -v. +var Cmd = builtins.Command{Name: "printf", MakeFlags: builtins.NoFlags(run)} + +// maxFormatIterations bounds the format-reuse loop to prevent runaway output. +const maxFormatIterations = 10_000 + +// maxWidthOrPrec caps width/precision values to prevent huge allocations. +const maxWidthOrPrec = 10_000 + +func run(ctx context.Context, callCtx *builtins.CallContext, args []string) builtins.Result { + // Manual flag handling: only --help/-h is accepted; -v is rejected. + // -- terminates options (allows format strings starting with -). + if len(args) > 0 { + switch args[0] { + case "--help", "-h": + callCtx.Out("Usage: printf FORMAT [ARGUMENT]...\n") + callCtx.Out("Write formatted output to standard output.\n") + return builtins.Result{} + case "-v": + callCtx.Errf("printf: -v: not supported in restricted shell\n") + return builtins.Result{Code: 1} + case "--": + args = args[1:] // skip -- + } + } + + if len(args) == 0 { + callCtx.Errf("printf: usage: printf [-v var] format [arguments]\n") + return builtins.Result{Code: 1} + } + + format := args[0] + fmtArgs := args[1:] + + // Strip a leading "--" from format arguments (allows negative numbers + // after the format string: printf "%d" -- -42). 
+ if len(fmtArgs) > 0 && fmtArgs[0] == "--" { + fmtArgs = fmtArgs[1:] + } + + argIdx := 0 + hadError := false + iterations := 0 + + for { + if ctx.Err() != nil { + break + } + if iterations >= maxFormatIterations { + break + } + iterations++ + + startArgIdx := argIdx + stop, err := processFormat(callCtx, format, fmtArgs, &argIdx, &hadError) + if err { + hadError = true + } + if stop { + // \c in %b — stop all output immediately. + break + } + + // If no args were consumed in this pass, or we've consumed all args, stop. + if argIdx <= startArgIdx || argIdx >= len(fmtArgs) { + break + } + // More args remain — reuse the format string. + } + + if hadError { + return builtins.Result{Code: 1} + } + return builtins.Result{} +} + +// processFormat walks the format string once, outputting literal text and +// processing format specifiers. It returns (stop, hadError). +// stop is true if \c was encountered in a %b argument. +func processFormat(callCtx *builtins.CallContext, format string, args []string, argIdx *int, hadError *bool) (bool, bool) { + i := 0 + for i < len(format) { + ch := format[i] + + if ch == '\\' { + // Process escape sequence in format string. + s, advance := processFormatEscape(format[i:]) + callCtx.Out(s) + i += advance + continue + } + + if ch == '%' { + if i+1 < len(format) && format[i+1] == '%' { + callCtx.Out("%") + i += 2 + continue + } + stop, advance, err := processSpecifier(callCtx, format[i:], args, argIdx) + if err { + *hadError = true + } + if stop { + return true, *hadError + } + i += advance + continue + } + + // Literal byte; slicing keeps bytes >= 0x80 intact (string(ch) would re-encode them as UTF-8). + callCtx.Out(format[i : i+1]) + i++ + } + return false, *hadError +} + +// processFormatEscape handles a backslash escape in the format string (not in %b arguments). +// Returns the replacement string and the number of bytes consumed from s. 
+func processFormatEscape(s string) (string, int) { + if len(s) < 2 { + return "\\", 1 + } + switch s[1] { + case '\\': + return "\\", 2 + case 'a': + return "\a", 2 + case 'b': + return "\b", 2 + case 'f': + return "\f", 2 + case 'n': + return "\n", 2 + case 'r': + return "\r", 2 + case 't': + return "\t", 2 + case 'v': + return "\v", 2 + case '"': + return "\"", 2 + case '0': + // \0NNN — octal (0 + up to 3 digits); emitted as a single raw byte. + val, consumed := parseOctal(s[2:], 3) + return string([]byte{byte(val)}), 2 + consumed + case 'x': + // \xHH — hex (up to 2 digits); emitted as a single raw byte. + val, consumed := parseHex(s[2:], 2) + if consumed == 0 { + return "\\x", 2 + } + return string([]byte{byte(val)}), 2 + consumed + default: + if s[1] >= '1' && s[1] <= '7' { + // \NNN — octal without leading 0 (1-3 digits); emitted as a single raw byte. + val, consumed := parseOctal(s[1:], 3) + return string([]byte{byte(val)}), 1 + consumed + } + // Unknown escape: output backslash and character. + return string([]byte{'\\', s[1]}), 2 + } +} + +// processSpecifier handles a single % format specifier starting at s[0]=='%'. +// Returns (stop, bytesConsumed, hadError). +func processSpecifier(callCtx *builtins.CallContext, s string, args []string, argIdx *int) (bool, int, bool) { + i := 1 // skip '%' + hadError := false + + // Parse flags: -, +, ' ', 0, # + var flags strings.Builder + for i < len(s) { + switch s[i] { + case '-', '+', ' ', '0', '#': + flags.WriteByte(s[i]) + i++ + continue + } + break + } + + // Parse width (digits or *) + var width string + if i < len(s) && s[i] == '*' { + // Width from argument. + w, err := getIntArg(args, argIdx, callCtx) + if err { + hadError = true + } + width = strconv.Itoa(w) + i++ + } else { + start := i + for i < len(s) && s[i] >= '0' && s[i] <= '9' { + i++ + } + width = s[start:i] + } + + // Parse precision + var precision string + hasPrecision := false + if i < len(s) && s[i] == '.' { + hasPrecision = true + i++ // skip '.' 
+ if i < len(s) && s[i] == '*' { + p, err := getIntArg(args, argIdx, callCtx) + if err { + hadError = true + } + precision = strconv.Itoa(p) + i++ + } else { + start := i + for i < len(s) && s[i] >= '0' && s[i] <= '9' { + i++ + } + precision = s[start:i] + } + } + + // Clamp width/precision for safety. + if w, err := strconv.Atoi(width); err == nil && (w > maxWidthOrPrec || w < -maxWidthOrPrec) { + if w > 0 { + width = strconv.Itoa(maxWidthOrPrec) + } else { + width = strconv.Itoa(-maxWidthOrPrec) + } + } + if p, err := strconv.Atoi(precision); err == nil && p > maxWidthOrPrec { + precision = strconv.Itoa(maxWidthOrPrec) + } + + if i >= len(s) { + // Incomplete specifier — print what we have. + callCtx.Out(s[:i]) + return false, i, hadError + } + + verb := s[i] + i++ // consume verb + + // Build Go format string. + var goFmt strings.Builder + goFmt.WriteByte('%') + goFmt.WriteString(flags.String()) + goFmt.WriteString(width) + if hasPrecision { + goFmt.WriteByte('.') + goFmt.WriteString(precision) + } + + switch verb { + case 's': + arg := getStringArg(args, argIdx) + goFmt.WriteByte('s') + callCtx.Out(fmt.Sprintf(goFmt.String(), arg)) + + case 'b': + arg := getStringArg(args, argIdx) + processed, stop := processBEscapes(arg) + // Apply width/precision formatting to the processed string. + goFmt.WriteByte('s') + callCtx.Out(fmt.Sprintf(goFmt.String(), processed)) + if stop { + return true, i, hadError + } + + case 'c': + arg := getStringArg(args, argIdx) + if len(arg) > 0 { + // %c prints the first character (byte). + goFmt.WriteByte('c') + callCtx.Out(fmt.Sprintf(goFmt.String(), arg[0])) + } else { + // Empty argument produces a NUL byte (bash behavior). + callCtx.Out("\x00") + } + + case 'd', 'i': + arg := getStringArg(args, argIdx) + val, err := parseIntArg(arg) + if err != nil && arg != "" { + callCtx.Errf("printf: %s: invalid number\n", arg) + // Bash continues with value 0 and sets exit code. 
+ val = 0 + goFmt.WriteByte('d') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + return false, i, true + } + goFmt.WriteByte('d') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + + case 'o': + arg := getStringArg(args, argIdx) + val, err := parseUintArg(arg) + if err != nil && arg != "" { + callCtx.Errf("printf: %s: invalid number\n", arg) + val = 0 + goFmt.WriteByte('o') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + return false, i, true + } + goFmt.WriteByte('o') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + + case 'u': + arg := getStringArg(args, argIdx) + val, err := parseUintArg(arg) + if err != nil && arg != "" { + callCtx.Errf("printf: %s: invalid number\n", arg) + val = 0 + goFmt.WriteByte('d') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + return false, i, true + } + goFmt.WriteByte('d') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + + case 'x': + arg := getStringArg(args, argIdx) + val, err := parseUintArg(arg) + if err != nil && arg != "" { + callCtx.Errf("printf: %s: invalid number\n", arg) + val = 0 + goFmt.WriteByte('x') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + return false, i, true + } + goFmt.WriteByte('x') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + + case 'X': + arg := getStringArg(args, argIdx) + val, err := parseUintArg(arg) + if err != nil && arg != "" { + callCtx.Errf("printf: %s: invalid number\n", arg) + val = 0 + goFmt.WriteByte('X') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + return false, i, true + } + goFmt.WriteByte('X') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + + case 'e': + arg := getStringArg(args, argIdx) + val, err := parseFloatArg(arg) + if err != nil && arg != "" { + callCtx.Errf("printf: %s: invalid number\n", arg) + val = 0 + goFmt.WriteByte('e') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + return false, i, true + } + goFmt.WriteByte('e') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + + case 'E': + arg := getStringArg(args, argIdx) + val, err := parseFloatArg(arg) + if err 
!= nil && arg != "" { + callCtx.Errf("printf: %s: invalid number\n", arg) + val = 0 + goFmt.WriteByte('E') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + return false, i, true + } + goFmt.WriteByte('E') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + + case 'f': + arg := getStringArg(args, argIdx) + val, err := parseFloatArg(arg) + if err != nil && arg != "" { + callCtx.Errf("printf: %s: invalid number\n", arg) + val = 0 + goFmt.WriteByte('f') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + return false, i, true + } + goFmt.WriteByte('f') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + + case 'F': + arg := getStringArg(args, argIdx) + val, err := parseFloatArg(arg) + if err != nil && arg != "" { + callCtx.Errf("printf: %s: invalid number\n", arg) + val = 0 + } + // Go doesn't have %F; use %f and uppercase manually. + goFmt.WriteByte('f') + out := fmt.Sprintf(goFmt.String(), val) + out = strings.ToUpper(out) + callCtx.Out(out) + if err != nil && arg != "" { + return false, i, true + } + + case 'g': + arg := getStringArg(args, argIdx) + val, err := parseFloatArg(arg) + if err != nil && arg != "" { + callCtx.Errf("printf: %s: invalid number\n", arg) + val = 0 + goFmt.WriteByte('g') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + return false, i, true + } + goFmt.WriteByte('g') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + + case 'G': + arg := getStringArg(args, argIdx) + val, err := parseFloatArg(arg) + if err != nil && arg != "" { + callCtx.Errf("printf: %s: invalid number\n", arg) + val = 0 + goFmt.WriteByte('G') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + return false, i, true + } + goFmt.WriteByte('G') + callCtx.Out(fmt.Sprintf(goFmt.String(), val)) + + case 'n': + callCtx.Errf("printf: %%n: not supported (security risk)\n") + _ = getStringArg(args, argIdx) // consume arg + return false, i, true + + case 'q': + callCtx.Errf("printf: %%q: not supported\n") + _ = getStringArg(args, argIdx) + return false, i, true + + case 'a', 'A': + 
callCtx.Errf("printf: %%%c: not supported\n", verb) + _ = getStringArg(args, argIdx) + return false, i, true + + default: + // Unknown specifier — print literally. + callCtx.Outf("%%%c", verb) + } + + return false, i, hadError +} + +// getStringArg returns the next argument, or "" if exhausted. +func getStringArg(args []string, idx *int) string { + if *idx >= len(args) { + return "" + } + s := args[*idx] + *idx++ + return s +} + +// getIntArg returns the next argument parsed as an int (for * width/precision), or 0. +// The second return value is true if parsing failed. +func getIntArg(args []string, idx *int, callCtx *builtins.CallContext) (int, bool) { + s := getStringArg(args, idx) + if s == "" { + return 0, false + } + v, err := strconv.Atoi(s) + if err != nil { + callCtx.Errf("printf: %s: invalid number\n", s) + return 0, true + } + return v, false +} + +// parseIntArg parses a string as a signed integer, supporting decimal, octal (0-prefix), +// hex (0x-prefix), and character constants ('X or "X). +func parseIntArg(s string) (int64, error) { + if s == "" { + return 0, nil + } + + // Character constant: 'X or "X + if len(s) >= 2 && (s[0] == '\'' || s[0] == '"') { + return int64(s[1]), nil + } + + // Try parsing with automatic base detection. + val, err := strconv.ParseInt(s, 0, 64) + if err != nil { + return 0, err + } + return val, nil +} + +// parseUintArg parses a string as an unsigned integer. +func parseUintArg(s string) (uint64, error) { + if s == "" { + return 0, nil + } + + // Character constant: 'X or "X + if len(s) >= 2 && (s[0] == '\'' || s[0] == '"') { + return uint64(s[1]), nil + } + + // Handle negative numbers: parse as signed, then interpret as unsigned. + if len(s) > 0 && s[0] == '-' { + val, err := strconv.ParseInt(s, 0, 64) + if err != nil { + return 0, err + } + // Bash wraps negatives as unsigned. 
+ return uint64(val), nil + } + + val, err := strconv.ParseUint(s, 0, 64) + if err != nil { + // ParseUint rejects a leading '+'; fall back to a signed parse so inputs like "+3" are accepted. + sval, serr := strconv.ParseInt(s, 0, 64) + if serr != nil { + return 0, err + } + return uint64(sval), nil + } + return val, nil +} + +// parseFloatArg parses a string as a float64, supporting hex integer prefixes +// and character constants. +func parseFloatArg(s string) (float64, error) { + if s == "" { + return 0, nil + } + + // Character constant. + if len(s) >= 2 && (s[0] == '\'' || s[0] == '"') { + return float64(s[1]), nil + } + + // Handle hex integers used as float args (0xff etc). + if len(s) > 2 && s[0] == '0' && (s[1] == 'x' || s[1] == 'X') { + val, err := strconv.ParseInt(s, 0, 64) + if err != nil { + return 0, err + } + return float64(val), nil + } + + // Handle infinity explicitly; NaN is left to strconv.ParseFloat below. + lower := strings.ToLower(s) + if lower == "inf" || lower == "infinity" || lower == "+inf" || lower == "+infinity" { + return math.Inf(1), nil + } + if lower == "-inf" || lower == "-infinity" { + return math.Inf(-1), nil + } + + val, err := strconv.ParseFloat(s, 64) + if err != nil { + return 0, err + } + return val, nil +} + +// processBEscapes handles backslash escapes for %b (like echo -e). +// Returns the processed string and whether \c was seen (stop all output). 
+func processBEscapes(s string) (string, bool) { + var b strings.Builder + b.Grow(len(s)) + i := 0 + for i < len(s) { + if s[i] != '\\' || i+1 >= len(s) { + b.WriteByte(s[i]) + i++ + continue + } + i++ // skip '\' + switch s[i] { + case '\\': + b.WriteByte('\\') + case 'a': + b.WriteByte('\a') + case 'b': + b.WriteByte('\b') + case 'c': + return b.String(), true + case 'f': + b.WriteByte('\f') + case 'n': + b.WriteByte('\n') + case 'r': + b.WriteByte('\r') + case 't': + b.WriteByte('\t') + case 'v': + b.WriteByte('\v') + case '0': + // Octal: \0nnn (up to 3 digits after '0') + i++ + val, consumed := parseOctal(s[i:], 3) + i += consumed + b.WriteByte(byte(val)) + continue + case 'x': + // Hex: \xHH (up to 2 digits) + i++ + val, consumed := parseHex(s[i:], 2) + if consumed == 0 { + b.WriteByte('\\') + b.WriteByte('x') + continue + } + i += consumed + b.WriteByte(byte(val)) + continue + default: + // Unrecognized: output backslash and character. + b.WriteByte('\\') + b.WriteByte(s[i]) + } + i++ + } + return b.String(), false +} + +// parseOctal reads up to maxDigits octal digits from s and returns the +// accumulated value and the number of bytes consumed. +func parseOctal(s string, maxDigits int) (int, int) { + val := 0 + n := 0 + for n < maxDigits && n < len(s) && s[n] >= '0' && s[n] <= '7' { + val = val*8 + int(s[n]-'0') + n++ + } + return val, n +} + +// parseHex reads up to maxDigits hexadecimal digits from s and returns +// the accumulated value and the number of bytes consumed. 
+func parseHex(s string, maxDigits int) (int, int) { + val := 0 + n := 0 + for n < maxDigits && n < len(s) { + ch := s[n] + switch { + case ch >= '0' && ch <= '9': + val = val*16 + int(ch-'0') + case ch >= 'a' && ch <= 'f': + val = val*16 + int(ch-'a') + 10 + case ch >= 'A' && ch <= 'F': + val = val*16 + int(ch-'A') + 10 + default: + return val, n + } + n++ + } + return val, n +} diff --git a/interp/builtins/printf/printf_gnu_compat_test.go b/interp/builtins/printf/printf_gnu_compat_test.go new file mode 100644 index 00000000..a7de406d --- /dev/null +++ b/interp/builtins/printf/printf_gnu_compat_test.go @@ -0,0 +1,269 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package printf_test + +import ( + "testing" + + "github.com/stretchr/testify/assert" +) + +// GNU compatibility tests for printf. +// +// These tests verify byte-for-byte output equivalence with GNU coreutils +// printf (captured from bash on Debian bookworm). Each test documents the +// exact GNU invocation used to produce the reference output. + +// TestGNUCompatSimpleString — basic string output. +// +// GNU command: printf "%s\n" hello +// Expected: "hello\n" +func TestGNUCompatSimpleString(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%s\n" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello\n", stdout) +} + +// TestGNUCompatFormatReuse — format reuse for excess arguments. +// +// GNU command: printf "%s\n" a b c +// Expected: "a\nb\nc\n" +func TestGNUCompatFormatReuse(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%s\n" a b c`) + assert.Equal(t, 0, code) + assert.Equal(t, "a\nb\nc\n", stdout) +} + +// TestGNUCompatMissingArgs — missing args default to "" and 0. 
+// +// GNU command: printf "%s:%d\n" hello +// Expected: "hello:0\n" +func TestGNUCompatMissingArgs(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%s:%d\n" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello:0\n", stdout) +} + +// TestGNUCompatPercentLiteral — %% produces a single %. +// +// GNU command: printf "100%%\n" +// Expected: "100%\n" +func TestGNUCompatPercentLiteral(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "100%%\n"`) + assert.Equal(t, 0, code) + assert.Equal(t, "100%\n", stdout) +} + +// TestGNUCompatZeroPad — zero-padded integer. +// +// GNU command: printf "%05d\n" 42 +// Expected: "00042\n" +func TestGNUCompatZeroPad(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%05d\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, "00042\n", stdout) +} + +// TestGNUCompatWidthString — right-aligned string with width. +// +// GNU command: printf "%10s\n" hi +// Expected: " hi\n" +func TestGNUCompatWidthString(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%10s\n" hi`) + assert.Equal(t, 0, code) + assert.Equal(t, " hi\n", stdout) +} + +// TestGNUCompatLeftAlign — left-aligned string. +// +// GNU command: printf "%-10s|\n" hi +// Expected: "hi |\n" +func TestGNUCompatLeftAlign(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%-10s|\n" hi`) + assert.Equal(t, 0, code) + assert.Equal(t, "hi |\n", stdout) +} + +// TestGNUCompatPrecisionFloat — float with precision. +// +// GNU command: printf "%.2f\n" 3.14159 +// Expected: "3.14\n" +func TestGNUCompatPrecisionFloat(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%.2f\n" 3.14159`) + assert.Equal(t, 0, code) + assert.Equal(t, "3.14\n", stdout) +} + +// TestGNUCompatPrecisionString — string truncation with precision. 
+// +// GNU command: printf "%.3s\n" hello +// Expected: "hel\n" +func TestGNUCompatPrecisionString(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%.3s\n" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "hel\n", stdout) +} + +// TestGNUCompatOctalOutput — %o format. +// +// GNU command: printf "%o\n" 255 +// Expected: "377\n" +func TestGNUCompatOctalOutput(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%o\n" 255`) + assert.Equal(t, 0, code) + assert.Equal(t, "377\n", stdout) +} + +// TestGNUCompatHexOutput — %x and %X format. +// +// GNU command: printf "%x %X\n" 255 255 +// Expected: "ff FF\n" +func TestGNUCompatHexOutput(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%x %X\n" 255 255`) + assert.Equal(t, 0, code) + assert.Equal(t, "ff FF\n", stdout) +} + +// TestGNUCompatScientific — %e format. +// +// GNU command: printf "%e\n" 3.14 +// Expected: "3.140000e+00\n" +func TestGNUCompatScientific(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%e\n" 3.14`) + assert.Equal(t, 0, code) + assert.Equal(t, "3.140000e+00\n", stdout) +} + +// TestGNUCompatShortestFloat — %g format. +// +// GNU command: printf "%g\n" 3.14 +// Expected: "3.14\n" +func TestGNUCompatShortestFloat(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%g\n" 3.14`) + assert.Equal(t, 0, code) + assert.Equal(t, "3.14\n", stdout) +} + +// TestGNUCompatCharConstant — character constant argument. +// +// GNU command: printf "%d\n" "'A" +// Expected: "65\n" +func TestGNUCompatCharConstant(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" "'A"`) + assert.Equal(t, 0, code) + assert.Equal(t, "65\n", stdout) +} + +// TestGNUCompatHexInput — hex input parsing. +// +// GNU command: printf "%d\n" 0xff +// Expected: "255\n" +func TestGNUCompatHexInput(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" 0xff`) + assert.Equal(t, 0, code) + assert.Equal(t, "255\n", stdout) +} + +// TestGNUCompatOctalInput — octal input parsing. 
+// +// GNU command: printf "%d\n" 0755 +// Expected: "493\n" +func TestGNUCompatOctalInput(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" 0755`) + assert.Equal(t, 0, code) + assert.Equal(t, "493\n", stdout) +} + +// TestGNUCompatHashFlag — %#x adds 0x prefix. +// +// GNU command: printf "%#x\n" 255 +// Expected: "0xff\n" +func TestGNUCompatHashFlag(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%#x\n" 255`) + assert.Equal(t, 0, code) + assert.Equal(t, "0xff\n", stdout) +} + +// TestGNUCompatPlusFlag — %+d adds sign. +// +// GNU command: printf "%+d\n" 42 +// Expected: "+42\n" +func TestGNUCompatPlusFlag(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%+d\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, "+42\n", stdout) +} + +// TestGNUCompatInvalidNumber — non-numeric arg for %d. +// +// GNU command: printf "%d\n" abc +// Expected stdout: "0\n", stderr: a "printf:" diagnostic for 'abc' (GNU words it "expected a numeric value"), exit code: 1 +func TestGNUCompatInvalidNumber(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%d\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, "0\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +// TestGNUCompatBSpecifierBackslashC — %b with \c stops output. +// +// GNU command: printf "%b" 'hello\cworld' +// Expected: "hello" +func TestGNUCompatBSpecifierBackslashC(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" 'hello\cworld'`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello", stdout) +} + +// TestGNUCompatEmptyFormat — empty format string. +// +// GNU command: printf "" +// Expected: "" +func TestGNUCompatEmptyFormat(t *testing.T) { + stdout, _, code := cmdRun(t, `printf ""`) + assert.Equal(t, 0, code) + assert.Equal(t, "", stdout) +} + +// TestGNUCompatCharFirstOnly — %c takes only the first character.
+// +// GNU command: printf "%c\n" hello +// Expected: "h\n" +func TestGNUCompatCharFirstOnly(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%c\n" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "h\n", stdout) +} + +// TestGNUCompatUnsigned — %u format. +// +// GNU command: printf "%u\n" 42 +// Expected: "42\n" +func TestGNUCompatUnsigned(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%u\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, "42\n", stdout) +} + +// TestGNUCompatDefaultFloat — %f default precision is 6. +// +// GNU command: printf "%f\n" 3.14 +// Expected: "3.140000\n" +func TestGNUCompatDefaultFloat(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%f\n" 3.14`) + assert.Equal(t, 0, code) + assert.Equal(t, "3.140000\n", stdout) +} + +// TestGNUCompatOctalEscapeInFormat — \NNN in format string. +// +// GNU command: printf "\101\n" +// Expected: "A\n" +func TestGNUCompatOctalEscapeInFormat(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "\101\n"`) + assert.Equal(t, 0, code) + assert.Equal(t, "A\n", stdout) +} diff --git a/interp/builtins/printf/printf_pentest_test.go b/interp/builtins/printf/printf_pentest_test.go new file mode 100644 index 00000000..69561313 --- /dev/null +++ b/interp/builtins/printf/printf_pentest_test.go @@ -0,0 +1,331 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. 
+ +package printf_test + +import ( + "context" + "math" + "strconv" + "strings" + "testing" + "time" + + "github.com/stretchr/testify/assert" +) + +// --- Integer edge cases --- + +func TestPentestIntZero(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" 0`) + assert.Equal(t, 0, code) + assert.Equal(t, "0\n", stdout) +} + +func TestPentestIntOne(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" 1`) + assert.Equal(t, 0, code) + assert.Equal(t, "1\n", stdout) +} + +func TestPentestIntMaxInt32(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" 2147483647`) + assert.Equal(t, 0, code) + assert.Equal(t, "2147483647\n", stdout) +} + +func TestPentestIntMaxInt64(t *testing.T) { + max := strconv.FormatInt(math.MaxInt64, 10) + stdout, _, code := cmdRun(t, `printf "%d\n" `+max) + assert.Equal(t, 0, code) + assert.Equal(t, max+"\n", stdout) +} + +func TestPentestIntMaxInt64PlusOne(t *testing.T) { + // MaxInt64 + 1 = 9223372036854775808 — should overflow + _, stderr, code := cmdRun(t, `printf "%d\n" 9223372036854775808`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +func TestPentestIntHugeNumber(t *testing.T) { + _, stderr, code := cmdRun(t, `printf "%d\n" 99999999999999999999`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +func TestPentestIntNegativeOne(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" -- -1`) + assert.Equal(t, 0, code) + assert.Equal(t, "-1\n", stdout) +} + +func TestPentestIntNegativeHuge(t *testing.T) { + _, stderr, code := cmdRun(t, `printf "%d\n" -- -9999999999999999999`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +func TestPentestIntPlusZero(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" +0`) + assert.Equal(t, 0, code) + assert.Equal(t, "0\n", stdout) +} + +func TestPentestIntPlusOne(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" +1`) + assert.Equal(t, 0, code) + assert.Equal(t, "1\n", stdout) +} 
+ +func TestPentestIntEmpty(t *testing.T) { + // Empty string for %d → default 0 + stdout, _, code := cmdRun(t, `printf "%d\n" ""`) + assert.Equal(t, 0, code) + assert.Equal(t, "0\n", stdout) +} + +func TestPentestIntWhitespace(t *testing.T) { + // Whitespace-only string for %d → invalid + stdout, stderr, code := cmdRun(t, `printf "%d\n" " "`) + assert.Equal(t, 1, code) + assert.Equal(t, "0\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +// --- Same for unsigned conversions (%u, %o, %x) --- + +func TestPentestUnsignedMaxInt64(t *testing.T) { + max := strconv.FormatInt(math.MaxInt64, 10) + stdout, _, code := cmdRun(t, `printf "%u\n" `+max) + assert.Equal(t, 0, code) + assert.Equal(t, max+"\n", stdout) +} + +func TestPentestHexMaxInt32(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%x\n" 2147483647`) + assert.Equal(t, 0, code) + assert.Equal(t, "7fffffff\n", stdout) +} + +// --- Flag and argument injection --- + +func TestPentestUnknownFlags(t *testing.T) { + // -f is not a printf option; either outcome below is acceptable + _, stderr, code := cmdRun(t, `printf -f "%s" hello`) + // In NoFlags mode printf treats -f as the format string and exits 0; + // otherwise it must fail with a "printf:" diagnostic on stderr + if code == 0 { + // If it succeeds, -f was treated as a format string + assert.Equal(t, 0, code) + } else { + assert.Contains(t, stderr, "printf:") + } +} + +func TestPentestFollowFlag(t *testing.T) { + _, stderr, code := cmdRun(t, `printf --follow "%s" hello`) + // --follow is treated as the format string (NoFlags), or rejected with a "printf:" diagnostic + if code == 0 { + assert.Equal(t, 0, code) + } else { + assert.Contains(t, stderr, "printf:") + } +} + +func TestPentestEndOfFlagsWithFlagLikeFilename(t *testing.T) { + // After --, "-v" is treated as the format string (no specifiers), + // and "hello" is an unused extra argument. Output is just "-v".
+ stdout, _, code := cmdRun(t, `printf -- "-v" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "-v", stdout) +} + +func TestPentestEndOfFlagsWithPercentS(t *testing.T) { + // After --, "%s" is the format string and "hello" is the argument. + stdout, _, code := cmdRun(t, `printf -- "%s" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello", stdout) +} + +func TestPentestMultipleStdinDash(t *testing.T) { + // printf doesn't read stdin, so "-" is just a string + stdout, _, code := cmdRun(t, `printf "%s %s\n" - -`) + assert.Equal(t, 0, code) + assert.Equal(t, "- -\n", stdout) +} + +// --- Format reuse bounding --- + +func TestPentestFormatReuseMany(t *testing.T) { + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + // 100 args should be fine + args := strings.Repeat("x ", 100) + stdout, _, code := runScriptCtx(ctx, t, `printf "%s\n" `+args, "") + assert.Equal(t, 0, code) + lines := strings.Split(strings.TrimRight(stdout, "\n"), "\n") + assert.Equal(t, 100, len(lines)) +} + +func TestPentestNoSpecifiersExtraArgs(t *testing.T) { + // Format with no specifiers and extra args — format is printed once + stdout, _, code := cmdRun(t, `printf "hello\n" a b c d e`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello\n", stdout) +} + +// --- Width/precision bounds --- + +func TestPentestHugeWidth(t *testing.T) { + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + stdout, _, code := runScriptCtx(ctx, t, `printf "%99999d\n" 42`, "") + assert.Equal(t, 0, code) + // Width should be clamped to 10000 + assert.LessOrEqual(t, len(stdout), 10002) + assert.Contains(t, stdout, "42") +} + +func TestPentestHugePrecision(t *testing.T) { + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + stdout, _, code := runScriptCtx(ctx, t, `printf "%.99999f\n" 3.14`, "") + assert.Equal(t, 0, code) + // Precision should be clamped to 10000 + assert.LessOrEqual(t, 
len(stdout), 10010) +} + +// --- Rejected specifiers --- + +func TestPentestPercentN(t *testing.T) { + _, stderr, code := cmdRun(t, `printf "%n" foo`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") + assert.Contains(t, stderr, "not supported") +} + +func TestPentestPercentQ(t *testing.T) { + _, stderr, code := cmdRun(t, `printf "%q" foo`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +func TestPentestPercentA(t *testing.T) { + _, stderr, code := cmdRun(t, `printf "%a" 3.14`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +func TestPentestPercentAUpper(t *testing.T) { + _, stderr, code := cmdRun(t, `printf "%A" 3.14`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +// --- V flag rejection --- + +func TestPentestVFlag(t *testing.T) { + _, stderr, code := cmdRun(t, `printf -v myvar "%s" hello`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +// --- Special characters in format and args --- + +func TestPentestNulByteInArg(t *testing.T) { + // NOTE: despite the name, this passes a plain ASCII arg; no NUL byte or other special character is exercised + stdout, _, code := cmdRun(t, `printf "%s\n" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello\n", stdout) +} + +func TestPentestEmptyArgs(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%s|%s|%s\n" "" "" ""`) + assert.Equal(t, 0, code) + assert.Equal(t, "||\n", stdout) +} + +// --- Float edge cases --- + +func TestPentestFloatInfinity(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%f\n" inf`) + assert.Equal(t, 0, code) + assert.Contains(t, stdout, "Inf") +} + +func TestPentestFloatNaN(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%f\n" nan`) + assert.Equal(t, 0, code) + assert.Contains(t, stdout, "NaN") +} + +func TestPentestFloatZero(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%f\n" 0`) + assert.Equal(t, 0, code) + assert.Equal(t, "0.000000\n", stdout) +} + +// --- Behavior matching --- + +func
TestPentestBashCompatPercentD(t *testing.T) { + // Bash: printf "%d\n" 42 → "42\n" + stdout, _, code := cmdRun(t, `printf "%d\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, "42\n", stdout) +} + +func TestPentestBashCompatFormatReusePartial(t *testing.T) { + // Bash: printf "%s=%d\n" a 1 b → "a=1\nb=0\n" + stdout, _, code := cmdRun(t, `printf "%s=%d\n" a 1 b`) + assert.Equal(t, 0, code) + assert.Equal(t, "a=1\nb=0\n", stdout) +} + +// --- Star width/precision --- + +func TestPentestStarWidth(t *testing.T) { + // printf "%*s\n" 10 hello → right-aligned in 10-char field + stdout, _, code := cmdRun(t, `printf "%*s\n" 10 hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "     hello\n", stdout) +} + +func TestPentestStarPrecision(t *testing.T) { + // printf "%.*f\n" 2 3.14159 → "3.14\n" + stdout, _, code := cmdRun(t, `printf "%.*f\n" 2 3.14159`) + assert.Equal(t, 0, code) + assert.Equal(t, "3.14\n", stdout) +} + +func TestPentestStarWidthInvalid(t *testing.T) { + // Invalid number for * width → exit code 1 + stdout, stderr, code := cmdRun(t, `printf "%*d\n" abc 42`) + assert.Equal(t, 1, code) + assert.Equal(t, "42\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +func TestPentestStarPrecisionInvalid(t *testing.T) { + // Invalid number for * precision → exit code 1 + _, stderr, code := cmdRun(t, `printf "%.*f\n" abc 3.14`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +func TestPentestStarWidthNegative(t *testing.T) { + // Negative width via * → left-align (bash behavior) + stdout, _, code := cmdRun(t, `printf "%*s|\n" -- -10 hi`) + assert.Equal(t, 0, code) + assert.Equal(t, "hi        |\n", stdout) +} + +func TestPentestBashCompatInvalidNumContinues(t *testing.T) { + // Bash prints 0 and continues with exit code 1 + stdout, stderr, code := cmdRun(t, `printf "%d %d\n" abc 42`) + assert.Equal(t, 1, code) + assert.Equal(t, "0 42\n", stdout) + assert.Contains(t, stderr, "printf:") +} diff --git a/interp/builtins/printf/printf_test.go
b/interp/builtins/printf/printf_test.go new file mode 100644 index 00000000..12a8702f --- /dev/null +++ b/interp/builtins/printf/printf_test.go @@ -0,0 +1,748 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package printf_test + +import ( + "context" + "testing" + "time" + + "github.com/stretchr/testify/assert" + + "github.com/DataDog/rshell/interp" + "github.com/DataDog/rshell/interp/builtins/testutil" +) + +// runScriptCtx runs a shell script with a context and returns stdout, stderr, +// and the exit code. +func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ...interp.RunnerOption) (string, string, int) { + t.Helper() + return testutil.RunScriptCtx(ctx, t, script, dir, opts...) +} + +// runScript runs a shell script and returns stdout, stderr, and the exit code. +func runScript(t *testing.T, script, dir string, opts ...interp.RunnerOption) (string, string, int) { + t.Helper() + return testutil.RunScript(t, script, dir, opts...) +} + +// cmdRun runs a printf command (no file access needed). 
+func cmdRun(t *testing.T, script string) (stdout, stderr string, exitCode int) { + t.Helper() + return runScript(t, script, "") +} + +// --- Basic functionality --- + +func TestPrintfSimpleString(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%s\n" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello\n", stdout) +} + +func TestPrintfNoArgs(t *testing.T) { + _, stderr, code := cmdRun(t, `printf`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +func TestPrintfFormatOnly(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "hello world\n"`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello world\n", stdout) +} + +func TestPrintfMultipleArgs(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%s %s\n" hello world`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello world\n", stdout) +} + +func TestPrintfFormatReuse(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%s\n" a b c`) + assert.Equal(t, 0, code) + assert.Equal(t, "a\nb\nc\n", stdout) +} + +func TestPrintfMissingArgString(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%s and %s\n" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello and \n", stdout) +} + +func TestPrintfMissingArgNumber(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d and %d\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, "42 and 0\n", stdout) +} + +func TestPrintfPercentLiteral(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "100%%\n"`) + assert.Equal(t, 0, code) + assert.Equal(t, "100%\n", stdout) +} + +func TestPrintfEmptyFormat(t *testing.T) { + stdout, _, code := cmdRun(t, `printf ""`) + assert.Equal(t, 0, code) + assert.Equal(t, "", stdout) +} + +func TestPrintfNoNewline(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "hello"`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello", stdout) +} + +// --- Escape sequences --- + +func TestPrintfEscapeNewline(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "a\nb\n"`) + assert.Equal(t, 0, 
code) + assert.Equal(t, "a\nb\n", stdout) +} + +func TestPrintfEscapeTab(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "a\tb\n"`) + assert.Equal(t, 0, code) + assert.Equal(t, "a\tb\n", stdout) +} + +func TestPrintfEscapeBackslash(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "a\\\\b\n"`) + assert.Equal(t, 0, code) + assert.Equal(t, "a\\b\n", stdout) +} + +func TestPrintfEscapeCarriageReturn(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "hello\rworld\n"`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello\rworld\n", stdout) +} + +func TestPrintfEscapeOctal(t *testing.T) { + // \101 = octal 101 = 65 = 'A' + stdout, _, code := cmdRun(t, `printf "\101\n"`) + assert.Equal(t, 0, code) + assert.Equal(t, "A\n", stdout) +} + +func TestPrintfEscapeHex(t *testing.T) { + // \x41 = hex 41 = 65 = 'A' + stdout, _, code := cmdRun(t, `printf "\x41\n"`) + assert.Equal(t, 0, code) + assert.Equal(t, "A\n", stdout) +} + +func TestPrintfEscapeBell(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "\a"`) + assert.Equal(t, 0, code) + assert.Equal(t, "\a", stdout) +} + +func TestPrintfEscapeFormFeed(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "\f"`) + assert.Equal(t, 0, code) + assert.Equal(t, "\f", stdout) +} + +func TestPrintfEscapeVerticalTab(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "\v"`) + assert.Equal(t, 0, code) + assert.Equal(t, "\v", stdout) +} + +func TestPrintfEscapeBackspace(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "\b"`) + assert.Equal(t, 0, code) + assert.Equal(t, "\b", stdout) +} + +// --- Format specifiers --- + +func TestPrintfSpecifierString(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%s" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello", stdout) +} + +func TestPrintfSpecifierChar(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%c\n" A`) + assert.Equal(t, 0, code) + assert.Equal(t, "A\n", stdout) +} + +func TestPrintfSpecifierCharEmpty(t *testing.T) { + // Empty arg for 
%c should produce a NUL byte (bash behavior) + stdout, _, code := cmdRun(t, `printf "%c" ""`) + assert.Equal(t, 0, code) + assert.Equal(t, "\x00", stdout) +} + +func TestPrintfSpecifierDecimal(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, "42\n", stdout) +} + +func TestPrintfSpecifierInteger(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%i\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, "42\n", stdout) +} + +func TestPrintfSpecifierOctal(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%o\n" 255`) + assert.Equal(t, 0, code) + assert.Equal(t, "377\n", stdout) +} + +func TestPrintfSpecifierUnsigned(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%u\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, "42\n", stdout) +} + +func TestPrintfSpecifierHexLower(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%x\n" 255`) + assert.Equal(t, 0, code) + assert.Equal(t, "ff\n", stdout) +} + +func TestPrintfSpecifierHexUpper(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%X\n" 255`) + assert.Equal(t, 0, code) + assert.Equal(t, "FF\n", stdout) +} + +func TestPrintfSpecifierFloat(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%f\n" 3.14`) + assert.Equal(t, 0, code) + assert.Equal(t, "3.140000\n", stdout) +} + +func TestPrintfSpecifierScientific(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%e\n" 3.14`) + assert.Equal(t, 0, code) + assert.Equal(t, "3.140000e+00\n", stdout) +} + +func TestPrintfSpecifierScientificUpper(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%E\n" 3.14`) + assert.Equal(t, 0, code) + assert.Equal(t, "3.140000E+00\n", stdout) +} + +func TestPrintfSpecifierShortest(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%g\n" 3.14`) + assert.Equal(t, 0, code) + assert.Equal(t, "3.14\n", stdout) +} + +func TestPrintfSpecifierShortestUpper(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%G\n" 3.14`) + assert.Equal(t, 0, 
code) + assert.Equal(t, "3.14\n", stdout) +} + +func TestPrintfSpecifierFloatF(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%F\n" 3.14`) + assert.Equal(t, 0, code) + assert.Equal(t, "3.140000\n", stdout) +} + +// --- %b specifier --- + +func TestPrintfSpecifierBEscapes(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b\n" 'hello\tworld'`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello\tworld\n", stdout) +} + +func TestPrintfSpecifierBBackslashC(t *testing.T) { + // \c stops all output + stdout, _, code := cmdRun(t, `printf "%b" 'hello\cworld'`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello", stdout) +} + +func TestPrintfSpecifierBOctal(t *testing.T) { + // %b uses \0NNN (with leading zero) for octal + stdout, _, code := cmdRun(t, `printf "%b\n" '\0101'`) + assert.Equal(t, 0, code) + assert.Equal(t, "A\n", stdout) +} + +// --- Width and precision --- + +func TestPrintfWidthRightAlign(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%10s\n" hi`) + assert.Equal(t, 0, code) + assert.Equal(t, "        hi\n", stdout) +} + +func TestPrintfWidthLeftAlign(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%-10s|\n" hi`) + assert.Equal(t, 0, code) + assert.Equal(t, "hi        |\n", stdout) +} + +func TestPrintfWidthZeroPad(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%05d\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, "00042\n", stdout) +} + +func TestPrintfPrecisionFloat(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%.2f\n" 3.14159`) + assert.Equal(t, 0, code) + assert.Equal(t, "3.14\n", stdout) +} + +func TestPrintfPrecisionString(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%.3s\n" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "hel\n", stdout) +} + +func TestPrintfWidthAndPrecision(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%10.3s\n" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "       hel\n", stdout) +} + +func TestPrintfFlagPlus(t *testing.T) { + stdout, _, code := cmdRun(t, `printf 
"%+d\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, "+42\n", stdout) +} + +func TestPrintfFlagSpace(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "% d\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, " 42\n", stdout) +} + +func TestPrintfFlagHash(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%#x\n" 255`) + assert.Equal(t, 0, code) + assert.Equal(t, "0xff\n", stdout) +} + +func TestPrintfFlagHashOctal(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%#o\n" 255`) + assert.Equal(t, 0, code) + assert.Equal(t, "0377\n", stdout) +} + +// --- Numeric argument formats --- + +func TestPrintfNumericNegative(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" -- -42`) + assert.Equal(t, 0, code) + assert.Equal(t, "-42\n", stdout) +} + +func TestPrintfNumericHexInput(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" 0xff`) + assert.Equal(t, 0, code) + assert.Equal(t, "255\n", stdout) +} + +func TestPrintfNumericOctalInput(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" 0755`) + assert.Equal(t, 0, code) + assert.Equal(t, "493\n", stdout) +} + +func TestPrintfNumericCharConstant(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" "'A"`) + assert.Equal(t, 0, code) + assert.Equal(t, "65\n", stdout) +} + +func TestPrintfNumericZero(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d\n" 0`) + assert.Equal(t, 0, code) + assert.Equal(t, "0\n", stdout) +} + +// --- Error handling --- + +func TestPrintfInvalidNumber(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%d\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, "0\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +func TestPrintfRejectedPercentN(t *testing.T) { + _, stderr, code := cmdRun(t, `printf "%n" foo`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +func TestPrintfRejectedVFlag(t *testing.T) { + _, stderr, code := cmdRun(t, `printf -v var "%s" hello`) + assert.Equal(t, 1, code) + 
assert.Contains(t, stderr, "printf:") +} + +// --- Help --- + +func TestPrintfHelp(t *testing.T) { + stdout, _, code := cmdRun(t, `printf --help`) + assert.Equal(t, 0, code) + assert.Contains(t, stdout, "Usage:") +} + +func TestPrintfHelpShort(t *testing.T) { + stdout, _, code := cmdRun(t, `printf -h`) + assert.Equal(t, 0, code) + assert.Contains(t, stdout, "Usage:") +} + +// --- Format reuse edge cases --- + +func TestPrintfFormatReuseMultipleSpecifiers(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%s=%d\n" a 1 b 2 c 3`) + assert.Equal(t, 0, code) + assert.Equal(t, "a=1\nb=2\nc=3\n", stdout) +} + +func TestPrintfFormatReusePartialFill(t *testing.T) { + // When format has 2 specifiers but odd number of extra args + stdout, _, code := cmdRun(t, `printf "%s=%d\n" a 1 b`) + assert.Equal(t, 0, code) + assert.Equal(t, "a=1\nb=0\n", stdout) +} + +func TestPrintfNoSpecifiers(t *testing.T) { + // Format with no specifiers and extra args — format is still printed + // but args are not consumed (no specifiers to consume them) + stdout, _, code := cmdRun(t, `printf "hello\n" extra args`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello\n", stdout) +} + +// --- Shell integration --- + +func TestPrintfInPipeline(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%s\n" hello | cat`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello\n", stdout) +} + +func TestPrintfInForLoop(t *testing.T) { + stdout, _, code := cmdRun(t, `for i in 1 2 3; do printf "%d " "$i"; done; printf "\n"`) + assert.Equal(t, 0, code) + assert.Equal(t, "1 2 3 \n", stdout) +} + +func TestPrintfVariableExpansion(t *testing.T) { + stdout, _, code := cmdRun(t, `NAME=world; printf "hello %s\n" "$NAME"`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello world\n", stdout) +} + +func TestPrintfZeroPaddedInt(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%05d\n" 42`) + assert.Equal(t, 0, code) + assert.Equal(t, "00042\n", stdout) +} + +// --- Context cancellation --- + +func 
TestPrintfContextCancellation(t *testing.T) { + ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second) + defer cancel() + // Smoke test: a short format-reuse run (10 args) finishes well within the deadline + // It does not force a cancellation; it only checks that the context-aware path exits 0 + _, _, code := runScriptCtx(ctx, t, `printf "%s\n" a b c d e f g h i j`, "") + assert.Equal(t, 0, code) +} + +// --- Double-dash separator --- + +func TestPrintfDoubleDash(t *testing.T) { + stdout, _, code := cmdRun(t, `printf -- "%s\n" hello`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello\n", stdout) +} + +// --- Octal escape edge cases --- + +func TestPrintfEscapeOctalZeroPrefix(t *testing.T) { + // \0101 = octal 101 = 65 = 'A' (format string uses \0NNN) + stdout, _, code := cmdRun(t, `printf "\0101\n"`) + assert.Equal(t, 0, code) + assert.Equal(t, "A\n", stdout) +} + +func TestPrintfEscapeOctalNulByte(t *testing.T) { + // \0 alone = NUL byte + stdout, _, code := cmdRun(t, `printf "a\0b"`) + assert.Equal(t, 0, code) + assert.Equal(t, "a\x00b", stdout) +} + +// --- Mixed format string and args --- + +func TestPrintfMixedText(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "Name: %s, Age: %d\n" Alice 30`) + assert.Equal(t, 0, code) + assert.Equal(t, "Name: Alice, Age: 30\n", stdout) +} + +func TestPrintfMultiplePercent(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%d%%\n" 100`) + assert.Equal(t, 0, code) + assert.Equal(t, "100%\n", stdout) +} + +// --- Coverage: rejected specifiers --- + +func TestPrintfRejectedQ(t *testing.T) { + _, stderr, code := cmdRun(t, `printf "%q" hello`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +func TestPrintfRejectedA(t *testing.T) { + _, stderr, code := cmdRun(t, `printf "%a" 3.14`) + assert.Equal(t, 1, code) + assert.Contains(t, stderr, "printf:") +} + +// --- Coverage: unknown specifier --- + +func TestPrintfUnknownSpecifier(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%z\n"`) + assert.Equal(t, 0, code) + 
assert.Equal(t, "%z\n", stdout) +} + +// --- Coverage: escape edge cases --- + +func TestPrintfEscapeDoubleQuote(t *testing.T) { + stdout, _, code := cmdRun(t, `printf '\"hello\"'`) + assert.Equal(t, 0, code) + assert.Equal(t, "\"hello\"", stdout) +} + +func TestPrintfEscapeUnknown(t *testing.T) { + // Unknown escape should output backslash and character + stdout, _, code := cmdRun(t, `printf '\q'`) + assert.Equal(t, 0, code) + assert.Equal(t, "\\q", stdout) +} + +func TestPrintfTrailingBackslash(t *testing.T) { + stdout, _, code := cmdRun(t, `printf 'hello\'`) + assert.Equal(t, 0, code) + assert.Equal(t, "hello\\", stdout) +} + +// --- Coverage: %b escape sequences --- + +func TestPrintfBEscapeTab(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" 'a\tb'`) + assert.Equal(t, 0, code) + assert.Equal(t, "a\tb", stdout) +} + +func TestPrintfBEscapeNewline(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" 'a\nb'`) + assert.Equal(t, 0, code) + assert.Equal(t, "a\nb", stdout) +} + +func TestPrintfBEscapeBackslash(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" 'a\\b'`) + assert.Equal(t, 0, code) + assert.Equal(t, "a\\b", stdout) +} + +func TestPrintfBEscapeHex(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" '\x41'`) + assert.Equal(t, 0, code) + assert.Equal(t, "A", stdout) +} + +func TestPrintfBEscapeHexInvalid(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" '\xZZ'`) + assert.Equal(t, 0, code) + assert.Equal(t, "\\xZZ", stdout) +} + +func TestPrintfBEscapeBell(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" '\a'`) + assert.Equal(t, 0, code) + assert.Equal(t, "\a", stdout) +} + +func TestPrintfBEscapeFormFeed(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" '\f'`) + assert.Equal(t, 0, code) + assert.Equal(t, "\f", stdout) +} + +func TestPrintfBEscapeCarriageReturn(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" '\r'`) + assert.Equal(t, 0, code) + assert.Equal(t, "\r", stdout) +} + 
+func TestPrintfBEscapeVerticalTab(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" '\v'`) + assert.Equal(t, 0, code) + assert.Equal(t, "\v", stdout) +} + +func TestPrintfBEscapeBackspace(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" '\b'`) + assert.Equal(t, 0, code) + assert.Equal(t, "\b", stdout) +} + +func TestPrintfBEscapeUnknown(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%b" '\q'`) + assert.Equal(t, 0, code) + assert.Equal(t, "\\q", stdout) +} + +// --- Coverage: parseFloatArg --- + +func TestPrintfFloatHexInput(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%f\n" 0xff`) + assert.Equal(t, 0, code) + assert.Equal(t, "255.000000\n", stdout) +} + +func TestPrintfFloatInfinity(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%f\n" inf`) + assert.Equal(t, 0, code) + assert.Contains(t, stdout, "Inf") +} + +func TestPrintfFloatNegInfinity(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%f\n" -- -inf`) + assert.Equal(t, 0, code) + assert.Contains(t, stdout, "-Inf") +} + +func TestPrintfFloatCharConstant(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%f\n" "'A"`) + assert.Equal(t, 0, code) + assert.Equal(t, "65.000000\n", stdout) +} + +func TestPrintfFloatInvalid(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%f\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, "0.000000\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +// --- Coverage: parseUintArg --- + +func TestPrintfUnsignedCharConstant(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%u\n" "'A"`) + assert.Equal(t, 0, code) + assert.Equal(t, "65\n", stdout) +} + +func TestPrintfUnsignedInvalid(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%u\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, "0\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +func TestPrintfOctalInvalid(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%o\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, 
"0\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +func TestPrintfHexInvalid(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%x\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, "0\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +func TestPrintfHexUpperInvalid(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%X\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, "0\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +// --- Coverage: float specifiers errors --- + +func TestPrintfScientificInvalid(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%e\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, "0.000000e+00\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +func TestPrintfScientificUpperInvalid(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%E\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, "0.000000E+00\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +func TestPrintfShortestInvalid(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%g\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, "0\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +func TestPrintfShortestUpperInvalid(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%G\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, "0\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +func TestPrintfFloatFUpperInvalid(t *testing.T) { + stdout, stderr, code := cmdRun(t, `printf "%F\n" abc`) + assert.Equal(t, 1, code) + assert.Equal(t, "0.000000\n", stdout) + assert.Contains(t, stderr, "printf:") +} + +// --- Coverage: incomplete specifier --- + +func TestPrintfIncompleteSpecifier(t *testing.T) { + stdout, _, code := cmdRun(t, `printf "%"`) + assert.Equal(t, 0, code) + assert.Equal(t, "%", stdout) +} + +// --- Coverage: hex escape in format with no valid digits --- + +func TestPrintfHexEscapeNoDigits(t *testing.T) { + stdout, _, code := cmdRun(t, `printf '\xZZ'`) + 
assert.Equal(t, 0, code) + assert.Equal(t, "\\xZZ", stdout) +} + +// --- Coverage: width clamping --- + +func TestPrintfWidthClamped(t *testing.T) { + // Very large width should be clamped, not cause OOM + stdout, _, code := cmdRun(t, `printf "%99999s\n" hi`) + assert.Equal(t, 0, code) + assert.Contains(t, stdout, "hi") + // Width clamped to 10000 + assert.LessOrEqual(t, len(stdout), 10002) +} diff --git a/interp/register_builtins.go b/interp/register_builtins.go index 8d7f50d5..a6aca9da 100644 --- a/interp/register_builtins.go +++ b/interp/register_builtins.go @@ -19,6 +19,7 @@ import ( "github.com/DataDog/rshell/interp/builtins/grep" "github.com/DataDog/rshell/interp/builtins/head" "github.com/DataDog/rshell/interp/builtins/ls" + printfcmd "github.com/DataDog/rshell/interp/builtins/printf" "github.com/DataDog/rshell/interp/builtins/strings_cmd" "github.com/DataDog/rshell/interp/builtins/tail" "github.com/DataDog/rshell/interp/builtins/testcmd" @@ -42,6 +43,7 @@ func registerBuiltins() { grep.Cmd, head.Cmd, ls.Cmd, + printfcmd.Cmd, strings_cmd.Cmd, tail.Cmd, testcmd.Cmd, diff --git a/tests/allowed_symbols_test.go b/tests/allowed_symbols_test.go index e74d5d17..2e69b592 100644 --- a/tests/allowed_symbols_test.go +++ b/tests/allowed_symbols_test.go @@ -68,6 +68,8 @@ var builtinAllowedSymbols = []string{ "io.ReadCloser", // io.Reader — interface type; no side effects. "io.Reader", + // math.Inf — returns positive or negative infinity; pure function, no I/O. + "math.Inf", // math.MaxInt32 — integer constant; no side effects. "math.MaxInt32", // math.MaxInt64 — integer constant; no side effects. @@ -92,6 +94,10 @@ var builtinAllowedSymbols = []string{ "strings.Builder", // strings.Join — concatenates a slice of strings with a separator; pure function, no I/O. "strings.Join", + // strings.ToLower — converts string to lowercase; pure function, no I/O. + "strings.ToLower", + // strings.ToUpper — converts string to uppercase; pure function, no I/O. 
+ "strings.ToUpper", // strings.Split — splits a string by separator into a slice; pure function, no I/O. "strings.Split", // strconv.Atoi — string-to-int conversion; pure function, no I/O. @@ -104,8 +110,12 @@ var builtinAllowedSymbols = []string{ "strconv.ErrRange", // strconv.NumError — error type for numeric conversion failures; pure type. "strconv.NumError", + // strconv.ParseFloat — string-to-float conversion; pure function, no I/O. + "strconv.ParseFloat", // strconv.ParseInt — string-to-int conversion with base/bit-size; pure function, no I/O. "strconv.ParseInt", + // strconv.ParseUint — string-to-unsigned-int conversion; pure function, no I/O. + "strconv.ParseUint", // strconv.FormatInt — int-to-string conversion; pure function, no I/O. "strconv.FormatInt", // strings.HasPrefix — pure function for prefix matching; no I/O. diff --git a/tests/scenarios/cmd/printf/basic/format_only.yaml b/tests/scenarios/cmd/printf/basic/format_only.yaml new file mode 100644 index 00000000..f6f43674 --- /dev/null +++ b/tests/scenarios/cmd/printf/basic/format_only.yaml @@ -0,0 +1,9 @@ +description: Printf with only a format string and no arguments prints the format string. +input: + script: |+ + printf "hello world\n" +expect: + stdout: |+ + hello world + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/basic/format_reuse.yaml b/tests/scenarios/cmd/printf/basic/format_reuse.yaml new file mode 100644 index 00000000..eb6fec8c --- /dev/null +++ b/tests/scenarios/cmd/printf/basic/format_reuse.yaml @@ -0,0 +1,11 @@ +description: Printf reuses the format string for excess arguments. 
+input: + script: |+ + printf "%s\n" a b c +expect: + stdout: |+ + a + b + c + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/basic/missing_arg_number.yaml b/tests/scenarios/cmd/printf/basic/missing_arg_number.yaml new file mode 100644 index 00000000..81faf920 --- /dev/null +++ b/tests/scenarios/cmd/printf/basic/missing_arg_number.yaml @@ -0,0 +1,9 @@ +description: Printf uses 0 for missing %d arguments. +input: + script: |+ + printf "%d and %d\n" 42 +expect: + stdout: |+ + 42 and 0 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/basic/missing_arg_string.yaml b/tests/scenarios/cmd/printf/basic/missing_arg_string.yaml new file mode 100644 index 00000000..0ad88930 --- /dev/null +++ b/tests/scenarios/cmd/printf/basic/missing_arg_string.yaml @@ -0,0 +1,8 @@ +description: Printf uses empty string for missing %s arguments. +input: + script: |+ + printf "%s and %s\n" hello +expect: + stdout: "hello and \n" + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/basic/multiple_args.yaml b/tests/scenarios/cmd/printf/basic/multiple_args.yaml new file mode 100644 index 00000000..b65f99ed --- /dev/null +++ b/tests/scenarios/cmd/printf/basic/multiple_args.yaml @@ -0,0 +1,9 @@ +description: Printf formats multiple arguments with multiple specifiers. +input: + script: |+ + printf "%s %s\n" hello world +expect: + stdout: |+ + hello world + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/basic/no_args.yaml b/tests/scenarios/cmd/printf/basic/no_args.yaml new file mode 100644 index 00000000..3004f050 --- /dev/null +++ b/tests/scenarios/cmd/printf/basic/no_args.yaml @@ -0,0 +1,8 @@ +description: Printf with no arguments produces an error. 
+input: + script: |+ + printf +expect: + stdout: "" + stderr_contains: ["printf:"] + exit_code: 1 diff --git a/tests/scenarios/cmd/printf/basic/percent_literal.yaml b/tests/scenarios/cmd/printf/basic/percent_literal.yaml new file mode 100644 index 00000000..0110e4d5 --- /dev/null +++ b/tests/scenarios/cmd/printf/basic/percent_literal.yaml @@ -0,0 +1,9 @@ +description: Printf outputs a literal percent sign with %%. +input: + script: |+ + printf "100%%\n" +expect: + stdout: |+ + 100% + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/basic/simple_string.yaml b/tests/scenarios/cmd/printf/basic/simple_string.yaml new file mode 100644 index 00000000..52bd6c6e --- /dev/null +++ b/tests/scenarios/cmd/printf/basic/simple_string.yaml @@ -0,0 +1,9 @@ +description: Printf formats and prints a simple string with %s specifier. +input: + script: |+ + printf "%s\n" hello +expect: + stdout: |+ + hello + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/errors/invalid_number.yaml b/tests/scenarios/cmd/printf/errors/invalid_number.yaml new file mode 100644 index 00000000..b7eedc83 --- /dev/null +++ b/tests/scenarios/cmd/printf/errors/invalid_number.yaml @@ -0,0 +1,9 @@ +description: Printf with an invalid number argument prints 0 and produces a warning. +input: + script: |+ + printf "%d\n" abc +expect: + stdout: |+ + 0 + stderr_contains: ["printf:"] + exit_code: 1 diff --git a/tests/scenarios/cmd/printf/errors/no_format.yaml b/tests/scenarios/cmd/printf/errors/no_format.yaml new file mode 100644 index 00000000..3004f050 --- /dev/null +++ b/tests/scenarios/cmd/printf/errors/no_format.yaml @@ -0,0 +1,8 @@ +description: Printf with no arguments produces an error. 
+input: + script: |+ + printf +expect: + stdout: "" + stderr_contains: ["printf:"] + exit_code: 1 diff --git a/tests/scenarios/cmd/printf/errors/rejected_n_specifier.yaml b/tests/scenarios/cmd/printf/errors/rejected_n_specifier.yaml new file mode 100644 index 00000000..1adbf797 --- /dev/null +++ b/tests/scenarios/cmd/printf/errors/rejected_n_specifier.yaml @@ -0,0 +1,8 @@ +description: Printf rejects the %n specifier for safety. +input: + script: |+ + printf "%n" foo +expect: + stdout: "" + stderr_contains: ["printf:"] + exit_code: 1 diff --git a/tests/scenarios/cmd/printf/errors/rejected_v_flag.yaml b/tests/scenarios/cmd/printf/errors/rejected_v_flag.yaml new file mode 100644 index 00000000..3ee2cadf --- /dev/null +++ b/tests/scenarios/cmd/printf/errors/rejected_v_flag.yaml @@ -0,0 +1,9 @@ +skip_assert_against_bash: true +description: Printf rejects the -v flag which assigns to a variable in bash. +input: + script: |+ + printf -v var "%s" hello +expect: + stdout: "" + stderr_contains: ["printf:"] + exit_code: 1 diff --git a/tests/scenarios/cmd/printf/escapes/backslash.yaml b/tests/scenarios/cmd/printf/escapes/backslash.yaml new file mode 100644 index 00000000..d81295b3 --- /dev/null +++ b/tests/scenarios/cmd/printf/escapes/backslash.yaml @@ -0,0 +1,8 @@ +description: Printf interprets double backslash as a literal backslash. +input: + script: |+ + printf "a\\\\b\n" +expect: + stdout: "a\\b\n" + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/escapes/bell_and_others.yaml b/tests/scenarios/cmd/printf/escapes/bell_and_others.yaml new file mode 100644 index 00000000..a1952f30 --- /dev/null +++ b/tests/scenarios/cmd/printf/escapes/bell_and_others.yaml @@ -0,0 +1,8 @@ +description: Printf interprets special escape sequences like bell, backspace, form feed, and vertical tab. 
+input: + script: |+ + printf "\a\b\f\v" +expect: + stdout: "\a\b\f\v" + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/escapes/carriage_return.yaml b/tests/scenarios/cmd/printf/escapes/carriage_return.yaml new file mode 100644 index 00000000..c6b6c489 --- /dev/null +++ b/tests/scenarios/cmd/printf/escapes/carriage_return.yaml @@ -0,0 +1,8 @@ +description: Printf interprets backslash-r as a carriage return. +input: + script: |+ + printf "hello\rworld\n" +expect: + stdout: "hello\rworld\n" + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/escapes/hex.yaml b/tests/scenarios/cmd/printf/escapes/hex.yaml new file mode 100644 index 00000000..f3cb7cf8 --- /dev/null +++ b/tests/scenarios/cmd/printf/escapes/hex.yaml @@ -0,0 +1,9 @@ +description: Printf interprets hex escape sequences in the format string. +input: + script: |+ + printf "\x41\n" +expect: + stdout: |+ + A + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/escapes/newline.yaml b/tests/scenarios/cmd/printf/escapes/newline.yaml new file mode 100644 index 00000000..7d968678 --- /dev/null +++ b/tests/scenarios/cmd/printf/escapes/newline.yaml @@ -0,0 +1,10 @@ +description: Printf interprets backslash-n as a newline in the format string. +input: + script: |+ + printf "a\nb\n" +expect: + stdout: |+ + a + b + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/escapes/octal.yaml b/tests/scenarios/cmd/printf/escapes/octal.yaml new file mode 100644 index 00000000..a9844e2b --- /dev/null +++ b/tests/scenarios/cmd/printf/escapes/octal.yaml @@ -0,0 +1,9 @@ +description: Printf interprets octal escape sequences in the format string. 
+input: + script: |+ + printf "\101\n" +expect: + stdout: |+ + A + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/escapes/tab.yaml b/tests/scenarios/cmd/printf/escapes/tab.yaml new file mode 100644 index 00000000..2f183ef7 --- /dev/null +++ b/tests/scenarios/cmd/printf/escapes/tab.yaml @@ -0,0 +1,8 @@ +description: Printf interprets backslash-t as a tab in the format string. +input: + script: |+ + printf "a\tb\n" +expect: + stdout: "a\tb\n" + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/numeric/char_constant.yaml b/tests/scenarios/cmd/printf/numeric/char_constant.yaml new file mode 100644 index 00000000..be6d4921 --- /dev/null +++ b/tests/scenarios/cmd/printf/numeric/char_constant.yaml @@ -0,0 +1,9 @@ +description: Printf converts a character constant to its ASCII value. +input: + script: |+ + printf "%d\n" "'A" +expect: + stdout: |+ + 65 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/numeric/hex_input.yaml b/tests/scenarios/cmd/printf/numeric/hex_input.yaml new file mode 100644 index 00000000..abad7a70 --- /dev/null +++ b/tests/scenarios/cmd/printf/numeric/hex_input.yaml @@ -0,0 +1,9 @@ +description: Printf converts hexadecimal input to decimal. +input: + script: |+ + printf "%d\n" 0xff +expect: + stdout: |+ + 255 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/numeric/negative.yaml b/tests/scenarios/cmd/printf/numeric/negative.yaml new file mode 100644 index 00000000..2365e634 --- /dev/null +++ b/tests/scenarios/cmd/printf/numeric/negative.yaml @@ -0,0 +1,9 @@ +description: Printf handles negative integer arguments. 
+input: + script: |+ + printf "%d\n" -42 +expect: + stdout: |+ + -42 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/numeric/octal_input.yaml b/tests/scenarios/cmd/printf/numeric/octal_input.yaml new file mode 100644 index 00000000..edc06112 --- /dev/null +++ b/tests/scenarios/cmd/printf/numeric/octal_input.yaml @@ -0,0 +1,9 @@ +description: Printf converts octal input to decimal. +input: + script: |+ + printf "%d\n" 0755 +expect: + stdout: |+ + 493 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/numeric/zero.yaml b/tests/scenarios/cmd/printf/numeric/zero.yaml new file mode 100644 index 00000000..83a7adcf --- /dev/null +++ b/tests/scenarios/cmd/printf/numeric/zero.yaml @@ -0,0 +1,9 @@ +description: Printf handles zero as an integer argument. +input: + script: |+ + printf "%d\n" 0 +expect: + stdout: |+ + 0 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/shell_features/command_substitution.yaml b/tests/scenarios/cmd/printf/shell_features/command_substitution.yaml new file mode 100644 index 00000000..3ddade23 --- /dev/null +++ b/tests/scenarios/cmd/printf/shell_features/command_substitution.yaml @@ -0,0 +1,10 @@ +description: Printf inside command substitution fails because command substitution is not supported. +skip_assert_against_bash: true +input: + script: |+ + X=$(printf "%05d" 42); echo "$X" +expect: + stdout: "" + stderr: |+ + command substitution is not supported + exit_code: 2 diff --git a/tests/scenarios/cmd/printf/shell_features/in_for_loop.yaml b/tests/scenarios/cmd/printf/shell_features/in_for_loop.yaml new file mode 100644 index 00000000..9bad84c1 --- /dev/null +++ b/tests/scenarios/cmd/printf/shell_features/in_for_loop.yaml @@ -0,0 +1,8 @@ +description: Printf works inside a for loop. 
+input: + script: |+ + for i in 1 2 3; do printf "%d " "$i"; done; printf "\n" +expect: + stdout: "1 2 3 \n" + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/shell_features/in_pipeline.yaml b/tests/scenarios/cmd/printf/shell_features/in_pipeline.yaml new file mode 100644 index 00000000..5a124df1 --- /dev/null +++ b/tests/scenarios/cmd/printf/shell_features/in_pipeline.yaml @@ -0,0 +1,9 @@ +description: Printf output can be piped to another command. +input: + script: |+ + printf "%s\n" hello | cat +expect: + stdout: |+ + hello + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/shell_features/variable_expansion.yaml b/tests/scenarios/cmd/printf/shell_features/variable_expansion.yaml new file mode 100644 index 00000000..a1ef4967 --- /dev/null +++ b/tests/scenarios/cmd/printf/shell_features/variable_expansion.yaml @@ -0,0 +1,9 @@ +description: Printf works with shell variable expansion. +input: + script: |+ + NAME=world; printf "hello %s\n" "$NAME" +expect: + stdout: |+ + hello world + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/b_escape.yaml b/tests/scenarios/cmd/printf/specifiers/b_escape.yaml new file mode 100644 index 00000000..53f252ff --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/b_escape.yaml @@ -0,0 +1,8 @@ +description: Printf %b specifier interprets backslash escapes in the argument. +input: + script: |+ + printf "%b\n" 'hello\tworld' +expect: + stdout: "hello\tworld\n" + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/b_with_backslash_c.yaml b/tests/scenarios/cmd/printf/specifiers/b_with_backslash_c.yaml new file mode 100644 index 00000000..5e076da2 --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/b_with_backslash_c.yaml @@ -0,0 +1,8 @@ +description: Printf %b with backslash-c in argument stops output immediately. 
+input: + script: |+ + printf "%b" 'hello\cworld' +expect: + stdout: "hello" + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/char_c.yaml b/tests/scenarios/cmd/printf/specifiers/char_c.yaml new file mode 100644 index 00000000..46bac25d --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/char_c.yaml @@ -0,0 +1,9 @@ +description: Printf %c specifier outputs the first character of the argument. +input: + script: |+ + printf "%c\n" A +expect: + stdout: |+ + A + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/decimal_d.yaml b/tests/scenarios/cmd/printf/specifiers/decimal_d.yaml new file mode 100644 index 00000000..5a309e05 --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/decimal_d.yaml @@ -0,0 +1,9 @@ +description: Printf %d specifier outputs a decimal integer. +input: + script: |+ + printf "%d\n" 42 +expect: + stdout: |+ + 42 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/float_f.yaml b/tests/scenarios/cmd/printf/specifiers/float_f.yaml new file mode 100644 index 00000000..4eb36928 --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/float_f.yaml @@ -0,0 +1,9 @@ +description: Printf %f specifier outputs a floating point number with default precision. +input: + script: |+ + printf "%f\n" 3.14 +expect: + stdout: |+ + 3.140000 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/hex_lower.yaml b/tests/scenarios/cmd/printf/specifiers/hex_lower.yaml new file mode 100644 index 00000000..bf670a9c --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/hex_lower.yaml @@ -0,0 +1,9 @@ +description: Printf %x specifier outputs lowercase hexadecimal. 
+input: + script: |+ + printf "%x\n" 255 +expect: + stdout: |+ + ff + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/hex_upper.yaml b/tests/scenarios/cmd/printf/specifiers/hex_upper.yaml new file mode 100644 index 00000000..05102eb0 --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/hex_upper.yaml @@ -0,0 +1,9 @@ +description: Printf %X specifier outputs uppercase hexadecimal. +input: + script: |+ + printf "%X\n" 255 +expect: + stdout: |+ + FF + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/integer_i.yaml b/tests/scenarios/cmd/printf/specifiers/integer_i.yaml new file mode 100644 index 00000000..10c4279b --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/integer_i.yaml @@ -0,0 +1,9 @@ +description: Printf %i specifier outputs a decimal integer (same as %d). +input: + script: |+ + printf "%i\n" 42 +expect: + stdout: |+ + 42 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/octal_o.yaml b/tests/scenarios/cmd/printf/specifiers/octal_o.yaml new file mode 100644 index 00000000..dd6af69c --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/octal_o.yaml @@ -0,0 +1,9 @@ +description: Printf %o specifier outputs an octal representation. +input: + script: |+ + printf "%o\n" 255 +expect: + stdout: |+ + 377 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/scientific_e.yaml b/tests/scenarios/cmd/printf/specifiers/scientific_e.yaml new file mode 100644 index 00000000..a8fd73d7 --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/scientific_e.yaml @@ -0,0 +1,9 @@ +description: Printf %e specifier outputs a number in scientific notation. 
+input: + script: |+ + printf "%e\n" 3.14 +expect: + stdout: |+ + 3.140000e+00 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/shortest_g.yaml b/tests/scenarios/cmd/printf/specifiers/shortest_g.yaml new file mode 100644 index 00000000..ceffe019 --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/shortest_g.yaml @@ -0,0 +1,9 @@ +description: Printf %g specifier outputs the shortest representation of a float. +input: + script: |+ + printf "%g\n" 3.14 +expect: + stdout: |+ + 3.14 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/string_s.yaml b/tests/scenarios/cmd/printf/specifiers/string_s.yaml new file mode 100644 index 00000000..b0e4b131 --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/string_s.yaml @@ -0,0 +1,9 @@ +description: Printf %s specifier outputs a string argument. +input: + script: |+ + printf "%s\n" hello +expect: + stdout: |+ + hello + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/specifiers/unsigned_u.yaml b/tests/scenarios/cmd/printf/specifiers/unsigned_u.yaml new file mode 100644 index 00000000..8093ae18 --- /dev/null +++ b/tests/scenarios/cmd/printf/specifiers/unsigned_u.yaml @@ -0,0 +1,9 @@ +description: Printf %u specifier outputs an unsigned decimal integer. +input: + script: |+ + printf "%u\n" 42 +expect: + stdout: |+ + 42 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/width_precision/left_align.yaml b/tests/scenarios/cmd/printf/width_precision/left_align.yaml new file mode 100644 index 00000000..de1a3fa5 --- /dev/null +++ b/tests/scenarios/cmd/printf/width_precision/left_align.yaml @@ -0,0 +1,9 @@ +description: Printf left-aligns a string within a specified width using the minus flag. 
+input: + script: |+ + printf "%-10s|\n" hi +expect: + stdout: |+ + hi | + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/width_precision/precision_float.yaml b/tests/scenarios/cmd/printf/width_precision/precision_float.yaml new file mode 100644 index 00000000..6d7c64ac --- /dev/null +++ b/tests/scenarios/cmd/printf/width_precision/precision_float.yaml @@ -0,0 +1,9 @@ +description: Printf applies precision to a floating point number. +input: + script: |+ + printf "%.2f\n" 3.14159 +expect: + stdout: |+ + 3.14 + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/width_precision/precision_string.yaml b/tests/scenarios/cmd/printf/width_precision/precision_string.yaml new file mode 100644 index 00000000..b212a37b --- /dev/null +++ b/tests/scenarios/cmd/printf/width_precision/precision_string.yaml @@ -0,0 +1,9 @@ +description: Printf applies precision to truncate a string. +input: + script: |+ + printf "%.3s\n" hello +expect: + stdout: |+ + hel + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/width_precision/right_align.yaml b/tests/scenarios/cmd/printf/width_precision/right_align.yaml new file mode 100644 index 00000000..75fdae9b --- /dev/null +++ b/tests/scenarios/cmd/printf/width_precision/right_align.yaml @@ -0,0 +1,8 @@ +description: Printf right-aligns a string within a specified width. +input: + script: |+ + printf "%10s\n" hi +expect: + stdout: " hi\n" + stderr: "" + exit_code: 0 diff --git a/tests/scenarios/cmd/printf/width_precision/zero_pad.yaml b/tests/scenarios/cmd/printf/width_precision/zero_pad.yaml new file mode 100644 index 00000000..3e3da0f2 --- /dev/null +++ b/tests/scenarios/cmd/printf/width_precision/zero_pad.yaml @@ -0,0 +1,9 @@ +description: Printf zero-pads a number to a specified width. 
+input: + script: |+ + printf "%05d\n" 42 +expect: + stdout: |+ + 00042 + stderr: "" + exit_code: 0 From f41229ac4324af2da066960de1f17d4ca0c5049d Mon Sep 17 00:00:00 2001 From: Alexandre Yang Date: Thu, 12 Mar 2026 00:57:08 +0100 Subject: [PATCH 02/20] update .claude/skills/code-review/SKILL.md --- .claude/skills/code-review/SKILL.md | 54 +++-------------------------- 1 file changed, 4 insertions(+), 50 deletions(-) diff --git a/.claude/skills/code-review/SKILL.md b/.claude/skills/code-review/SKILL.md index 84d52ae5..c9553c49 100644 --- a/.claude/skills/code-review/SKILL.md +++ b/.claude/skills/code-review/SKILL.md @@ -114,56 +114,10 @@ For every behavioral change: ### D. Test Coverage -Analyze coverage of changed code from two angles: **scenario tests** (YAML) and **Go tests**. Scenario tests are preferred because they also verify bash compatibility. - -#### Step 1: Inventory changed code paths - -For each changed or added function/branch/error-path, list the code path (e.g. "cut: `-f` with `--complement` and `--output-delimiter`", "error when delimiter is multi-byte"). - -#### Step 2: Check scenario test coverage (priority) - -Search `tests/scenarios/cmd//` for YAML scenarios that exercise each code path identified in Step 1. - -- **Covered** — a scenario exists whose `input.script` triggers the code path and `expect` asserts the output. -- **Partially covered** — a scenario triggers the code path but doesn't assert stderr, exit code, or an important edge case. -- **Not covered** — no scenario exercises the code path. - -Flag **not covered** and **partially covered** paths as findings. Suggest concrete YAML scenario(s) to add (including `description`, `input.script`, and expected `stdout`/`stderr`/`exit_code`). 
- -Scenario test conventions: -- Prefer `expect.stderr` (exact match) over `stderr_contains` -- Tests are asserted against bash by default — only use `skip_assert_against_bash: true` for intentional divergence -- Use `stdout_windows`/`stderr_windows` for platform-specific output -- If YAML scenarios are added or modified, verify they pass against bash - -#### Step 3: Check Go test coverage - -Search `interp/builtins//*_test.go` for Go tests that exercise any code paths **not already covered by scenario tests**. Go test types to check: - -| Test type | File pattern | What it covers | -|-----------|-------------|----------------| -| Functional | `_test.go` | Core logic, argument parsing, edge cases | -| GNU compat | `_gnu_compat_test.go` | Byte-for-byte output equivalence with GNU coreutils | -| Pentest | `_pentest_test.go` | Security vectors (overflow, special files, resource exhaustion) | -| Platform | `_{unix,windows}_test.go` | OS-specific behavior | - -Only flag missing Go tests for paths that **cannot be adequately covered by scenario tests** (e.g. internal error handling, concurrency, memory limits, platform-specific behavior, performance-sensitive paths). 
- -#### Step 4: Produce coverage summary - -Include a coverage table in the review output: - -```markdown -| Code path | Scenario test | Go test | Status | -|-----------|:---:|:---:|--------| -| `-f` with `--complement` | tests/scenarios/cmd/cut/complement/fields.yaml | — | Covered | -| multi-byte delimiter error | — | — | **Missing** | -| `/dev/zero` hang protection | skip (intentional divergence) | cut_pentest_test.go:45 | Covered | -``` - -Mark the overall coverage status: -- **Adequate** — all new/changed code paths are covered (scenario or Go tests) -- **Gaps found** — list missing coverage as P2 or P3 findings +- **Are new behaviors tested?** Every new code path should have a corresponding test +- **Are edge cases tested?** Empty input, boundary values, error conditions +- **YAML scenario conventions**: prefer `expect.stderr` over `stderr_contains`; tests are asserted against bash by default; use `stdout_windows`/`stderr_windows` for platform-specific output +- **Bash comparison**: if YAML scenarios are added or modified, verify they pass against bash ### E. Code Quality From d3c15e41225ff6b32194d099937489e15f4c43ed Mon Sep 17 00:00:00 2001 From: Alexandre Yang Date: Thu, 12 Mar 2026 01:03:25 +0100 Subject: [PATCH 03/20] update .claude/skills/code-review/SKILL.md --- .claude/skills/code-review/SKILL.md | 96 +++++++++++++++++++++++++++-- 1 file changed, 92 insertions(+), 4 deletions(-) diff --git a/.claude/skills/code-review/SKILL.md b/.claude/skills/code-review/SKILL.md index c9553c49..0fa2b4fe 100644 --- a/.claude/skills/code-review/SKILL.md +++ b/.claude/skills/code-review/SKILL.md @@ -114,10 +114,56 @@ For every behavioral change: ### D. 
Test Coverage -- **Are new behaviors tested?** Every new code path should have a corresponding test -- **Are edge cases tested?** Empty input, boundary values, error conditions -- **YAML scenario conventions**: prefer `expect.stderr` over `stderr_contains`; tests are asserted against bash by default; use `stdout_windows`/`stderr_windows` for platform-specific output -- **Bash comparison**: if YAML scenarios are added or modified, verify they pass against bash +Analyze coverage of changed code from two angles: **scenario tests** (YAML) and **Go tests**. Scenario tests are preferred because they also verify bash compatibility. + +#### Step 1: Inventory changed code paths + +For each changed or added function/branch/error-path, list the code path (e.g. "cut: `-f` with `--complement` and `--output-delimiter`", "error when delimiter is multi-byte"). + +#### Step 2: Check scenario test coverage (priority) + +Search `tests/scenarios/cmd//` for YAML scenarios that exercise each code path identified in Step 1. + +- **Covered** — a scenario exists whose `input.script` triggers the code path and `expect` asserts the output. +- **Partially covered** — a scenario triggers the code path but doesn't assert stderr, exit code, or an important edge case. +- **Not covered** — no scenario exercises the code path. + +Flag **not covered** and **partially covered** paths as findings. Suggest concrete YAML scenario(s) to add (including `description`, `input.script`, and expected `stdout`/`stderr`/`exit_code`). 
+ +Scenario test conventions: +- Prefer `expect.stderr` (exact match) over `stderr_contains` +- Tests are asserted against bash by default — only use `skip_assert_against_bash: true` for intentional divergence +- Use `stdout_windows`/`stderr_windows` for platform-specific output +- If YAML scenarios are added or modified, verify they pass against bash + +#### Step 3: Check Go test coverage + +Search `interp/builtins//*_test.go` for Go tests that exercise any code paths **not already covered by scenario tests**. Go test types to check: + +| Test type | File pattern | What it covers | +|-----------|-------------|----------------| +| Functional | `_test.go` | Core logic, argument parsing, edge cases | +| GNU compat | `_gnu_compat_test.go` | Byte-for-byte output equivalence with GNU coreutils | +| Pentest | `_pentest_test.go` | Security vectors (overflow, special files, resource exhaustion) | +| Platform | `_{unix,windows}_test.go` | OS-specific behavior | + +Only flag missing Go tests for paths that **cannot be adequately covered by scenario tests** (e.g. internal error handling, concurrency, memory limits, platform-specific behavior, performance-sensitive paths). + +#### Step 4: Produce coverage summary + +Include a coverage table in the review output: + +```markdown +| Code path | Scenario test | Go test | Status | +|-----------|:---:|:---:|--------| +| `-f` with `--complement` | tests/scenarios/cmd/cut/complement/fields.yaml | — | Covered | +| multi-byte delimiter error | — | — | **Missing** | +| `/dev/zero` hang protection | skip (intentional divergence) | cut_pentest_test.go:45 | Covered | +``` + +Mark the overall coverage status: +- **Adequate** — all new/changed code paths are covered (scenario or Go tests) +- **Gaps found** — list missing coverage as P2 or P3 findings ### E. Code Quality @@ -133,6 +179,48 @@ For every behavioral change: - Platform-aware path handling (not string concatenation)? - Are platform-specific test assertions using the correct fields? 
+### G. Unnecessary `skip_assert_against_bash: true` + +Every YAML scenario in `tests/scenarios/` is validated against bash by default. The `skip_assert_against_bash: true` flag must **only** be set when the shell intentionally diverges from bash (e.g. sandbox restrictions, blocked commands, readonly enforcement, different help/usage text). + +#### How to check + +1. **Find all scenarios with `skip_assert_against_bash: true`** in the changed or added YAML files: + ```bash + grep -rl 'skip_assert_against_bash: true' tests/scenarios/cmd// + ``` + +2. **For each flagged scenario**, run its script against GNU bash + coreutils to see what bash actually produces: + ```bash + docker run --rm debian:bookworm-slim bash -c '