fix: prevent rtk read from corrupting JSON/data files (#464) #522

Merged
pszymkowiak merged 1 commit into rtk-ai:develop from ousamabenyounes:fix/read-json-corruption
Mar 17, 2026

Conversation

@ousamabenyounes
Contributor

Summary

Fixes #464: rtk read package.json corrupts JSON files when string values contain /* or */.

Root cause: .json files were classified as Language::Unknown, which uses /* and */ as block comment delimiters. The /* inside the string "packages/*" was interpreted as opening a block comment and the */ inside "**/package.json" as closing it, so everything between the two was silently deleted.
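
To make the failure mode concrete, here is a minimal sketch of a block-comment stripper that has no notion of string literals. The name `strip_block_comments` and its signature are hypothetical and are not taken from src/filter.rs; the sketch only assumes that stripping removes everything from an opening delimiter to the next closing one.

```rust
/// Hypothetical helper (not the code in src/filter.rs): removes every span
/// from `open` to the next `close`, without checking whether the delimiters
/// sit inside a string literal.
fn strip_block_comments(input: &str, open: &str, close: &str) -> String {
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find(open) {
        // Keep the text before the opening delimiter.
        out.push_str(&rest[..start]);
        let after_open = &rest[start + open.len()..];
        match after_open.find(close) {
            // Skip past the closing delimiter and keep scanning.
            Some(end) => rest = &after_open[end + close.len()..],
            // Unterminated "comment": the remainder is dropped.
            None => rest = "",
        }
    }
    out.push_str(rest);
    out
}
```

Run with "/*" and "*/" on a JSON document where "packages/*" appears before "**/package.json", this deletes everything between the two strings, including any "scripts" key that sits there, which matches the corruption reported in #464.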

Fix: Add Language::Data variant with no comment patterns. JSON, YAML, TOML, XML, CSV, Markdown, and other data formats skip all comment stripping and code filtering entirely.

Before/After

# BEFORE: rtk read corrupts the JSON structure
$ rtk read package.json
{
  "name": "my-monorepo",
  "workspaces": {
    "packages": [
      "sort-package-json",                    # <-- from lint-staged!
      "biome check --write --no-errors-on-unmatched"
    ]
  }
}
# scripts: MISSING, lint-staged: MISSING, catalog: MISSING

# AFTER: JSON structure fully preserved
$ rtk read package.json
{
  "name": "my-monorepo",
  "workspaces": {
    "packages": [
      "packages/*"                            # <-- correct
    ],
    "catalog": { ... }
  },
  "scripts": {                                # <-- preserved
    "build": "bun run --workspaces build",
    "lint": "bun run --workspaces lint"
  },
  "lint-staged": {                            # <-- preserved
    "**/package.json": [ ... ]
  }
}

Why this matters

package.json is read dozens of times per Claude session. When corrupted:

  • Claude sees wrong workspace config, missing scripts, broken lint rules
  • Makes decisions based on false metadata (wrong build commands, missing dependencies)
  • Re-reads the file in a loop trying to reconcile the corruption, which wastes tokens

Also affects: tsconfig.json, docker-compose.yml, Cargo.lock, .env, schema.graphql, *.sql, etc.

Affected extensions

json, jsonc, json5, yaml, yml, toml, xml, csv, tsv, graphql, gql, sql, md, markdown, txt, env, lock
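
The extension list above, combined with the "no comment patterns" rule, could be modelled roughly as follows. This is a sketch under assumptions: only `Language::Data` and `Language::Unknown` are named in this PR; the `Language::Rust` variant, the `CommentPatterns` struct, and both method names are illustrative and may not match src/filter.rs.

```rust
/// Trimmed-down, hypothetical model of the language classification.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Language {
    Rust,
    Data,
    Unknown,
}

/// Comment delimiters for a language; `None` means "never strip".
#[derive(Debug, PartialEq)]
struct CommentPatterns {
    line: Option<&'static str>,
    block: Option<(&'static str, &'static str)>,
}

impl Language {
    /// Maps a file extension (without the dot) to a language.
    fn from_extension(ext: &str) -> Language {
        match ext.to_ascii_lowercase().as_str() {
            "rs" => Language::Rust,
            // Extension list copied from the PR description.
            "json" | "jsonc" | "json5" | "yaml" | "yml" | "toml" | "xml" | "csv" | "tsv"
            | "graphql" | "gql" | "sql" | "md" | "markdown" | "txt" | "env" | "lock" => {
                Language::Data
            }
            _ => Language::Unknown,
        }
    }

    fn comment_patterns(self) -> CommentPatterns {
        match self {
            Language::Rust => CommentPatterns { line: Some("//"), block: Some(("/*", "*/")) },
            // Data formats carry no comment patterns, so nothing is stripped.
            Language::Data => CommentPatterns { line: None, block: None },
            // The PR only states the block delimiters for Unknown; the line
            // delimiter is left unset here rather than guessed.
            Language::Unknown => CommentPatterns { line: None, block: Some(("/*", "*/")) },
        }
    }
}
```

With this shape, a stripper that consults `comment_patterns()` finds nothing to match for package.json or docker-compose.yml and returns the content untouched.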

Test plan

  • test_language_detection_data_formats — all data extensions map to Language::Data
  • test_json_no_comment_stripping — reproduces the exact #464 scenario (packages/*, scripts, lint-staged)
  • test_json_aggressive_filter_preserves_structure — aggressive filter also safe for JSON
  • Full suite: 767 passed, 0 failed
  • Manual test: rtk read package.json with the exact fixture from the issue
  • validate-docs.sh passes

Generated with Claude Code

@ousamabenyounes ousamabenyounes changed the base branch from master to develop March 12, 2026 03:17
@pszymkowiak
Collaborator

LGTM — good fix for a nasty bug. Adding Language::Data with empty comment patterns is the right approach.

Build passes, 55 filter tests pass. The exact reproduction case from #464 (packages/* treated as block comment) is covered.

One nit (non-blocking): the version bumps in ARCHITECTURE.md/CLAUDE.md/README.md are unrelated noise — ideally drop them from this PR since release-please handles versions. But not a blocker.

@aeppling for merge.

@pszymkowiak pszymkowiak requested a review from aeppling March 12, 2026 20:25
@ousamabenyounes ousamabenyounes force-pushed the fix/read-json-corruption branch from cded07a to 5f4ea62 March 12, 2026 21:53
@ousamabenyounes
Contributor Author

Thanks! Removed the version bumps from ARCHITECTURE.md, CLAUDE.md and README.md — rebased on latest develop first, so the PR now only touches src/filter.rs (1 file, 42+/10-).

Note: the Benchmark and Documentation Validation checks are failing due to pre-existing issues on develop (missing truncate function in cargo_cmd.rs:838, and version string mismatch in docs). Not related to this PR.

Add Language::Data variant for data formats (JSON, YAML, TOML, XML, CSV, etc.)
with empty comment patterns to prevent comment stripping. AggressiveFilter
falls back to MinimalFilter for data files.

Fixes rtk-ai#464

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ousama Ben Younes <benyounes.ousama@gmail.com>
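
The fallback described in the commit message above might look like the sketch below. The `Filter` trait, the two-variant `Language` enum, and what each filter does to code are assumptions made for illustration; only the names `AggressiveFilter`, `MinimalFilter`, and `Language::Data`, and the fact that the former defers to the latter for data files, come from the PR.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Language {
    Code, // stand-in for every real programming language
    Data,
}

trait Filter {
    fn filter(&self, content: &str, lang: Language) -> String;
}

struct MinimalFilter;
struct AggressiveFilter;

impl Filter for MinimalFilter {
    fn filter(&self, content: &str, lang: Language) -> String {
        // Data files skip comment stripping entirely.
        if lang == Language::Data {
            return content.to_string();
        }
        // Assumed behaviour for code: drop full-line `//` comments.
        content
            .lines()
            .filter(|line| !line.trim_start().starts_with("//"))
            .collect::<Vec<_>>()
            .join("\n")
    }
}

impl Filter for AggressiveFilter {
    fn filter(&self, content: &str, lang: Language) -> String {
        // Fallback from the commit message: data files get the minimal filter.
        if lang == Language::Data {
            return MinimalFilter.filter(content, lang);
        }
        // Assumed behaviour for code: keep only declaration-like lines.
        content
            .lines()
            .filter(|line| {
                let t = line.trim_start();
                t.starts_with("fn ") || t.starts_with("pub ") || t.starts_with("struct ")
            })
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

Routing the data case through `MinimalFilter` rather than returning early in both places keeps a single definition of what "untouched" means, which is one plausible reason for the fallback design.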
@ousamabenyounes ousamabenyounes force-pushed the fix/read-json-corruption branch from 5f4ea62 to 9533614 March 13, 2026 11:36
@pszymkowiak pszymkowiak merged commit cf1a790 into rtk-ai:develop Mar 17, 2026
2 of 3 checks passed
pszymkowiak added a commit that referenced this pull request Mar 18, 2026
* fix: P1 exit codes, grep regex perf, SQLite WAL (#631)

* fix: P1 exit codes, grep regex perf, SQLite concurrency

Exit code propagation (same pattern as existing modules):
- wget_cmd: run() and run_stdout() now exit on failure
- container: docker_logs, kubectl_pods/services/logs now check
  status before parsing JSON (was showing "No pods found" on error)
- pnpm_cmd: replace bail!() with eprint + process::exit in
  run_list and run_install

Performance:
- grep_cmd: compile context regex once before loop instead of
  per-line in clean_line() (was N compilations per grep call)

Data integrity:
- tracking: add PRAGMA journal_mode=WAL and busy_timeout=5000
  to prevent SQLite corruption with concurrent Claude Code instances

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: address review findings on P1 fixes

- tracking: WAL pragma non-fatal (NFS/read-only compat)
- wget: forward raw stderr on failure, track raw==raw (no fake savings)
- container: remove stderr shadow in docker_logs, add empty-stderr
  guard on all 4 new exit code paths for consistency with prisma pattern

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

---------

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: raise output caps for P0 bugs (#617, #618, #620) (#630)

* fix: raise output caps for grep, git status, and parser fallback (#617, #618, #620)

- grep: per-file match cap 10 → 25, global max 50 → 200
- git status: file list caps 5/5/3 → 15/15/10
- parser fallback: truncate 500 → 2000 chars across all modules

These P0 bugs caused LLM retry loops when RTK returned less signal
than the raw command, making RTK worse than not using it.

Fixes #617, #618, #620

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: update README example and add truncation tests for modified/untracked

- parser/README.md: update example from 500 → 2000 to match code
- git.rs: add test_format_status_modified_truncation (cap 15)
- git.rs: add test_format_status_untracked_truncation (cap 10)

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* refactor: extract output caps into [limits] config section

Move hardcoded caps into config.toml so users can tune them:

  [limits]
  grep_max_results = 200      # global grep match limit
  grep_max_per_file = 25      # per-file match limit
  status_max_files = 15       # staged/modified file list cap
  status_max_untracked = 10   # untracked file list cap
  passthrough_max_chars = 2000 # parser fallback truncation

All 8 modules now read from config::limits() instead of hardcoded
values. Defaults unchanged from previous commit.

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

---------

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* feat(.claude): add /rtk-triage skill — cross-analysis of PRs and issues (#662)

* feat(.claude): add /rtk-triage skill — orchestrated PR+issue cross-analysis

New skill that runs issue-triage + pr-triage in parallel then produces
a cross-analysis layer that neither skill can do individually:

- Double coverage detection: identifies when 2+ PRs target the same issue
  (via body scan + file overlap), recommends which to keep/close
- Security gap detection: for security review issues, maps each finding
  to a PR (or flags it as uncovered)
- P0/P1 bugs without PR: groups by pattern to suggest sprint batching
- Our dirty PRs: identifies probable cause (conflict with sibling PR,
  needs rebase, missing linked issue)

Output is saved automatically to claudedocs/RTK-YYYY-MM-DD.md.

Usage: /rtk-triage           (French, auto-save)
       /rtk-triage en        (English output)

Signed-off-by: Florian Bruniaux <florian@bel-etage.com>
Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>

* docs(architecture): update module count to 66

Sync ARCHITECTURE.md with current main.rs state.
Previous count (60) was stale since several modules were added
(dotnet_cmd, dotnet_format_report, dotnet_trx, npm_cmd, gt_cmd, etc.).

Signed-off-by: Florian Bruniaux <florian@bel-etage.com>
Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>

---------

Signed-off-by: Florian Bruniaux <florian@bel-etage.com>
Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>

* fix: subcommand routing drops unrecognized subcommands (#600) (#601)

- git stash: pass unknown subcommands (save, branch, clear) through
  instead of silently falling back to git stash push
- git branch: add --show-current, --set-upstream-to, --format, --sort
  to flag detection so they don't get overridden by -a injection
- pip: replace bail!() with passthrough for unknown subcommands
  (freeze, download, wheel, etc.)

Fixes #600

Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: resolve cargo fmt + 54 clippy warnings blocking CI (#663)

cargo fmt diffs in config.rs, git.rs, playwright_cmd.rs were failing
the fmt CI check, which cascaded to block clippy/test/security on
PRs #632, #635, #638. Also fixes all clippy warnings: dead code
annotations, iterator simplifications, assert patterns, and
unnecessary allocations.

Signed-off-by: Patrick Szymkowiak <patrick@rtk-ai.app>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: discover absolute paths + git global options (#485, #163) (#518)

* fix: discover classifies absolute paths like /usr/bin/grep (#485)

Normalize absolute binary paths before classification:
/usr/bin/grep → grep, /bin/ls → ls, /usr/local/bin/git → git

Adds strip_absolute_path() helper + 5 tests.

Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>
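
The path normalization in the commit above can be sketched as follows. The commit names a `strip_absolute_path()` helper but does not show it, so the signature and the handling of arguments here are assumptions.

```rust
/// Assumed shape of the helper: rewrites a leading absolute binary path to
/// its file name and leaves everything else in the command untouched.
fn strip_absolute_path(command: &str) -> String {
    // Separate the binary from its arguments at the first space.
    let (binary, args) = match command.split_once(' ') {
        Some((binary, args)) => (binary, Some(args)),
        None => (command, None),
    };
    // Relative names such as `git` need no normalization.
    if !binary.starts_with('/') {
        return command.to_string();
    }
    // Take the last path component; bail out on odd input like "/".
    let name = binary.rsplit('/').next().unwrap_or(binary);
    if name.is_empty() {
        return command.to_string();
    }
    match args {
        Some(args) => format!("{} {}", name, args),
        None => name.to_string(),
    }
}
```

Under these assumptions, "/usr/bin/grep -rn foo src" becomes "grep -rn foo src" and "/bin/ls" becomes "ls", while "git status" passes through unchanged.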

* fix: discover and rewrite support git global options -C, --no-pager, etc. (#163)

Strip git global options (-C <path>, -c <key=val>, --git-dir, --work-tree,
--no-pager, --no-optional-locks, --bare, --literal-pathspecs) before
classification so git -C /tmp status is recognized as rtk git.

Rewrite preserves global options: git -C /tmp status → rtk git -C /tmp status

Adds GIT_GLOBAL_OPT lazy_static regex + strip_git_global_opts() helper + 6 tests.

Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

---------

Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: prevent double `--` separator in cargo clippy with -p flags (#519)

When running `rtk cargo clippy -p my-crate -- -D warnings`, Clap with
`trailing_var_arg = true` preserves the `--` in parsed args when flags
precede it. `restore_double_dash()` then added a second `--`, producing
`cargo clippy -p my-crate -- -- -D warnings`. This caused rustc to
interpret `-D` as a filename instead of a lint flag.

Fix: skip restoration when args already contain `--` (Clap preserved it).

Fixes #496

Signed-off-by: Ousama Ben Younes <benyounes.ousama@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
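
The guard described in the commit above can be illustrated with a small sketch. The commit names `restore_double_dash()` but does not show its signature or how it decides where to re-insert the separator, so the two-slice signature and the re-insertion logic below are assumptions; only the early return when `--` is already present reflects the stated fix.

```rust
/// Assumed shape: `parsed` is what Clap returned, `raw` is the original
/// argument list as typed by the user.
fn restore_double_dash(parsed: &[String], raw: &[String]) -> Vec<String> {
    // Fix from the commit: Clap already preserved `--`, so adding another
    // one would produce `-- --`.
    if parsed.iter().any(|arg| arg == "--") {
        return parsed.to_vec();
    }
    // No separator in the original command line: nothing to restore.
    let pos = match raw.iter().position(|arg| arg == "--") {
        Some(pos) => pos,
        None => return parsed.to_vec(),
    };
    // Assumption: the args that followed `--` in `raw` are the tail of `parsed`.
    let trailing = raw.len() - pos - 1;
    let split = parsed.len().saturating_sub(trailing);
    let mut out = parsed[..split].to_vec();
    out.push("--".to_string());
    out.extend_from_slice(&parsed[split..]);
    out
}
```

For the arguments `-p my-crate -- -D warnings`, where Clap keeps the separator, the function returns its input unchanged instead of emitting a second `--`.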

* ci: add PR template + target branch check (#521)

- PR template reminds contributors to target develop
- CI workflow labels PRs targeting master with 'wrong-base' and posts a comment
- Excludes develop→master PRs (maintainer releases)

Signed-off-by: Patrick <patrick@rtk-ai.com>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: prevent rtk read from corrupting JSON/YAML/data files (#522)

Add Language::Data variant for data formats (JSON, YAML, TOML, XML, CSV, etc.)
with empty comment patterns to prevent comment stripping. AggressiveFilter
falls back to MinimalFilter for data files.

Fixes #464

Signed-off-by: Ousama Ben Younes <benyounes.ousama@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: skip rewriting find/fd in pipes to preserve xargs compatibility (#439) (#563)

rtk find outputs a grouped format incompatible with pipe consumers
like xargs, grep, wc, sort. Skip rewrite when find/fd is followed
by a pipe, preserving native one-per-line output.

Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: add hint when git diff is truncated + fix --no-compact passthrough (#427) (#564)

When compact_diff truncates output, append a hint line so Claude knows
how to get the full diff: [full diff: rtk git diff --no-compact]

Also fix --no-compact flag being passed to git (causing usage error)
and remove decorative emoji from compact_diff output.

Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: propagate exit codes in git diff, status+args, commit, and branch (#632)

4 P1 bugs where git exit codes were swallowed:
- git diff: failure silently printed empty stat output
- git status (with args): failure was filtered instead of propagated
- git commit: failure printed "FAILED" but returned Ok(()) breaking pre-commit hooks
- git branch (list mode): failure was silently ignored

All now follow the established pattern: eprint stderr, track raw==raw, process::exit(code).

Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* feat: add 5 new TOML filters (ollama, nx, gradle, spring-boot, jira) (#635)

* feat: add 5 new TOML built-in filters (ollama, nx, gradle, spring-boot, jira)

New filters for commands not covered by Rust modules:
- ollama: strip ANSI spinners, keep final text response (#624)
- nx: strip Nx monorepo noise, keep build results (#444)
- gradle/gradlew: strip UP-TO-DATE tasks, keep build summary (#147)
- spring-boot: strip banner and verbose logs, keep startup/errors (#147)
- jira: strip blanks, truncate wide columns (#524)

All 5 filters pass inline tests via rtk verify (123/123).
Updated builtin filter count: 47 -> 52.

Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* feat: add 5 more TOML filters (turbo, mise, just, task, yadm)

New filters for task runners and git wrapper:
- turbo: strip cache/Tasks/Duration noise, keep task output (#531)
- mise: strip install/download progress, keep task results (#607)
- just: strip blanks and recipe headers, keep output (#607)
- task: strip task headers and up-to-date lines, keep results (#607)
- yadm: strip hint lines, compact git-like output (#567)

All verified with fake binaries through catch-all TOML engine.
137/137 TOML tests pass, 934 Rust tests pass.
Updated builtin filter count: 52 -> 57.

Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

---------

Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

* fix: replace emojis with plain text in git status output (#603) (#638)

Git status output used emojis (📌, 📝, ❓, ✅, ⚠️) that confuse
non-Claude LLMs (GPT, etc.) causing retry loops. Replace with plain
text labels (branch:, modified:, staged:, untracked:, conflicts:).

Also add "clean — nothing to commit" when working tree is clean,
so LLMs understand the repo state without ambiguity.

Before: 📌 master
After:  branch: master
        clean — nothing to commit

Fixes #603

Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>

---------

Signed-off-by: Patrick <patrick@rtk.ai>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>
Signed-off-by: Florian Bruniaux <florian@bel-etage.com>
Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>
Signed-off-by: Patrick Szymkowiak <patrick@rtk-ai.app>
Signed-off-by: Ousama Ben Younes <benyounes.ousama@gmail.com>
Signed-off-by: Patrick <patrick@rtk-ai.com>
Co-authored-by: Florian BRUNIAUX <florian@bruniaux.com>
Co-authored-by: Ben Younes <benyounes.ousama@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
FlorianBruniaux pushed a commit that referenced this pull request Mar 19, 2026
Add Language::Data variant for data formats (JSON, YAML, TOML, XML, CSV, etc.)
with empty comment patterns to prevent comment stripping. AggressiveFilter
falls back to MinimalFilter for data files.

Fixes #464

Signed-off-by: Ousama Ben Younes <benyounes.ousama@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>
helgeu pushed a commit to helgeu/rtk that referenced this pull request Mar 27, 2026

Development

Successfully merging this pull request may close these issues.

bug: rtk read corrupts package.json when JSON strings contain /* or */
