
fix(core): commandPrefix word boundary and compound command safety #15006

Merged
abhipatel12 merged 7 commits into main from adh/bug/policy-engine-compound-commands
Dec 12, 2025

allenhutchison (Contributor) commented Dec 12, 2025

This change addresses two security/correctness issues in shell policy:

  1. strict word boundaries for commandPrefix (e.g., 'git log' no longer matches 'git logout')
  2. recursive safety checks for compound commands (e.g., 'git log && rm -rf /' is now properly vetted)
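
A minimal sketch of the word-boundary behavior in item 1, assuming the pattern is tested against the JSON-encoded tool arguments as the diff in this PR suggests. The helper names `buildPrefixPattern` and `matchesPrefix` are hypothetical and exist only for illustration; `escapeRegex` here is a stand-in written for this sketch, not the project's implementation.

```typescript
// Stand-in escaper for this sketch (not the project's own escapeRegex).
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds the pattern with the trailing boundary group
// that this PR adds after the escaped prefix.
function buildPrefixPattern(prefix: string): RegExp {
  return new RegExp(`"command":"${escapeRegex(prefix)}(?:[\\s"]|$)`);
}

// Hypothetical helper: tests the pattern against the JSON text of the args.
function matchesPrefix(prefix: string, command: string): boolean {
  return buildPrefixPattern(prefix).test(JSON.stringify({ command }));
}

const allowsLogWithArgs = matchesPrefix('git log', 'git log --oneline'); // true
const allowsBareLog = matchesPrefix('git log', 'git log'); // true
const allowsLogout = matchesPrefix('git log', 'git logout'); // false
```

Without the trailing group, the third call would also match, because `git logout` starts with the characters `git log`.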

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • MacOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

@allenhutchison allenhutchison requested a review from a team as a code owner December 12, 2025 19:10
@gemini-code-assist
Contributor

Summary of Changes

Hello @allenhutchison, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the security and correctness of shell command execution within the system. It introduces more precise matching for command prefixes in policy rules and implements a robust mechanism to evaluate all components of compound shell commands. These changes collectively prevent potential security vulnerabilities by ensuring that only explicitly approved and safe shell operations are permitted, even when commands are chained together.

Highlights

  • Strict Word Boundaries for commandPrefix: The shell policy now enforces strict word boundaries for commandPrefix rules, preventing unintended matches. For example, a rule for 'git log' will no longer incorrectly match 'git logout'.
  • Recursive Safety Checks for Compound Commands: The policy engine now performs recursive safety checks for compound shell commands (e.g., command1 && command2). Each sub-command is individually vetted against security policies, ensuring that even if one part is allowed, a malicious or unapproved sub-command will still be caught.
  • Refactored Shell Permission Logic: The core logic for checking shell command permissions and allowlisting has been refactored into a new dedicated module, shell-permissions.ts, improving modularity and maintainability.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

github-actions Bot commented Dec 12, 2025

Size Change: +2.21 kB (+0.01%)

Total Size: 21.6 MB

Filename | Size | Change
./bundle/gemini.js | 21.6 MB | +2.21 kB (+0.01%)

Unchanged files:

Filename | Size
./bundle/sandbox-macos-permissive-closed.sb | 1.03 kB
./bundle/sandbox-macos-permissive-open.sb | 890 B
./bundle/sandbox-macos-permissive-proxied.sb | 1.31 kB
./bundle/sandbox-macos-restrictive-closed.sb | 3.29 kB
./bundle/sandbox-macos-restrictive-open.sb | 3.36 kB
./bundle/sandbox-macos-restrictive-proxied.sb | 3.56 kB

compressed-size-action

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request introduces important security enhancements for shell command execution by enforcing strict word boundaries for commandPrefix and adding recursive safety checks for compound commands. The refactoring of shell permission logic into a separate shell-permissions.ts file is a good improvement for code organization.

However, I've identified a critical security vulnerability in the new compound command checking logic. If a command is crafted to be un-parseable by the shell parser but still matches an ALLOW rule on a prefix, it can bypass the safety checks. My review includes a specific comment and code suggestion to address this vulnerability by handling parser failures more safely.
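
The fail-closed handling described in this review can be sketched as follows. This is an illustration under stated assumptions, not the code merged in this PR: `splitCommands` below is a toy stand-in that reports a parse failure by returning an empty list, `vetShellCommand` is a hypothetical name, and downgrading ALLOW to ask_user (rather than deny) is an assumption.

```typescript
type Decision = 'allow' | 'ask_user' | 'deny';

// Toy stand-in for the shell parser: returns [] when it cannot parse.
// For this sketch, unbalanced double quotes count as a parse failure.
function splitCommands(command: string): string[] {
  const quoteCount = (command.match(/"/g) ?? []).length;
  if (quoteCount % 2 !== 0) return [];
  return command
    .split('&&')
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Hypothetical helper: covers only the parse-failure path. Vetting each
// parsed subcommand is out of scope for this sketch.
function vetShellCommand(command: string, ruleDecision: Decision): Decision {
  const subCommands = splitCommands(command);
  if (subCommands.length === 0) {
    // Fail closed: an ALLOW must not stand for a command we could not parse.
    return ruleDecision === 'allow' ? 'ask_user' : ruleDecision;
  }
  return ruleDecision;
}

const unparseable = vetShellCommand('git log "unterminated', 'allow'); // 'ask_user'
const parseable = vetShellCommand('git log', 'allow'); // 'allow'
```

The key property is that a prefix match alone is never enough to return ALLOW when the parser gives up on the input.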

Comment thread packages/core/src/policy/policy-engine.ts Outdated
This addresses PR feedback to ensure safe handling of un-parseable shell commands that might match an ALLOW prefix rule.
abhipatel12 (Contributor) left a comment

Looks great overall, just a nit and a question about whether we need a test file!

Comment thread packages/core/src/utils/shell-permissions.ts Outdated
Comment thread packages/core/src/utils/shell-permissions.ts
abhipatel12 (Contributor) left a comment

LGTM thanks!

@abhipatel12 abhipatel12 added this pull request to the merge queue Dec 12, 2025
Merged via the queue into main with commit a47af8e Dec 12, 2025
20 checks passed
@abhipatel12 abhipatel12 deleted the adh/bug/policy-engine-compound-commands branch December 12, 2025 23:20
toolName: 'run_shell_command',
// Mimic the regex generated by toml-loader for commandPrefix = ["git log"]
// Regex: "command":"git log(?:[\s"]|$)
argsPattern: /"command":"git log(?:[\s"]|$)/,
Review comment (Contributor):

I'm concerned this will get out of sync with toml-loader. It may take some refactoring, but can we either:

  • (Highest fidelity) Call the TOML loader here on a TOML string literal to generate the rules list.
  • (Lower fidelity, maybe easier) Export the functionality that builds argsPattern from commandPrefix and call that here.

Reply (Contributor, Author):

Follow Up in #15034

  expect(result.decision).toBe(PolicyDecision.ASK_USER);
});

it('SHOULD NOT allow "git log && rm -rf /" completely when prefix is "git log" (compound command safety)', async () => {
Review comment (Contributor):

Also:

  • Other combinators like ; and ||
  • Command substitution like $(rm -rf /) and process substitution like >(rm -rf /) and <(rm -rf /).
  • Powershell tests?

I think at least some of these are already handled correctly, but we should be very, pedantically thorough in verifying this functionality.
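
The cases listed above could be collected into a table-driven check along these lines. This is only a sketch: `toyCheck` is a deliberately over-strict stand-in written for this example (it refuses anything containing shell metacharacters), not the policy engine, and PowerShell cases are left out because their expected behavior is not stated in this thread.

```typescript
type Decision = 'allow' | 'ask_user';

// Toy checker for this sketch only: allows a command solely when it starts
// with "git log" at a word boundary and contains no shell metacharacters.
function toyCheck(command: string): Decision {
  const hasShellSyntax = /[;&|<>$`()\n]/.test(command);
  const isGitLog = /^git log(?:\s|$)/.test(command);
  return isGitLog && !hasShellSyntax ? 'allow' : 'ask_user';
}

// Inputs taken from the review comment: none may be auto-allowed by a
// "git log" prefix rule.
const mustNotBeAllowed: string[] = [
  'git log && rm -rf /',
  'git log ; rm -rf /',
  'git log || rm -rf /',
  'git log $(rm -rf /)',
  'git log >(rm -rf /)',
  'git log <(rm -rf /)',
];

// Any entry that ends up here is a policy bypass.
const bypasses = mustNotBeAllowed.filter((cmd) => toyCheck(cmd) === 'allow');

// Sanity check that the checker is not simply rejecting everything.
const plainLog = toyCheck('git log --oneline'); // 'allow'
```

In a real test suite the same list would be run against the actual policy engine, with one assertion per entry so a failure names the offending command.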

Reply (Contributor, Author):

Follow Up in #15034

  commandPrefixes.length > 0
    ? commandPrefixes.map(
-       (prefix) => `"command":"${escapeRegex(prefix)}`,
+       (prefix) => `"command":"${escapeRegex(prefix)}(?:[\\s"]|$)`,
Review comment (Contributor):

This repetitive diff highlights that there's some code duplication here between the PolicyRule and SafetyCheckerRule processing that should be eliminated.

Reply (Contributor, Author):

Follow Up in #15034

  commandPrefixes.length > 0
    ? commandPrefixes.map(
-       (prefix) => `"command":"${escapeRegex(prefix)}`,
+       (prefix) => `"command":"${escapeRegex(prefix)}(?:[\\s"]|$)`,
Review comment (Contributor):

(?:[\\s"]|$) can be simplified to just [\\s"]. Considering the JSON encoding of the tool call parameters, $ should never match.


Taking a step back, though, I think it would be better to rework argsPattern entirely. Writing regexes against the JSON encoded args object is tricky and error-prone for both policy writers and Gemini CLI developers. For example, writing a regex for commandRegex has a number of oddities that could trip up a policy writer:

  • ^ and $ do not match the start and end of the command string; in fact, they unconditionally fail to match.
  • The match is required to start at the beginning of the command string, but can end anywhere.
  • The regex is actually matched against a JSON-escaped string of the command. In particular, " is replaced with \", which could be a surprise.
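
The oddities above can be reproduced with plain `JSON.stringify`, assuming the args object is encoded as a single JSON object with a `command` string property (the exact encoding used by the engine is an assumption here; `encode` is a hypothetical helper).

```typescript
// Hypothetical helper: the JSON text a pattern would be matched against.
const encode = (command: string): string => JSON.stringify({ command });

// Anchors apply to the whole JSON text, not to the command string,
// so an anchored pattern never matches.
const anchoredMatch = /^git log$/.test(encode('git log')); // false

// The "$" alternative is dead: the closing quote of the JSON string
// always follows the command, so both patterns agree on every sample.
const withDollar = /"command":"git log(?:[\s"]|$)/;
const withoutDollar = /"command":"git log[\s"]/;
const samples = ['git log', 'git log -1', 'git logout'];
const sameResults = samples.every(
  (cmd) => withDollar.test(encode(cmd)) === withoutDollar.test(encode(cmd)),
);

// Double quotes inside the command are escaped in the JSON text, so a
// pattern has to match the backslash as well.
const encodedQuote = encode('echo "hi"');
const naiveQuoteMatch = /echo "hi"/.test(encodedQuote); // false
const escapedQuoteMatch = /echo \\"hi\\"/.test(encodedQuote); // true
```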

Instead, a Record<string, RegExp> mapping top-level string properties (notably "command") to regexes would be much easier to use, and partially simplify the regex construction you have to do here. A more robust (but painful for policy writers) alternative or additional feature would be an argsSchema property that lets you provide a JSON schema for the args object.
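
One way the `Record<string, RegExp>` idea could look, as a sketch only. `ArgPatterns` and `matchesArgs` are hypothetical names, and the choice to require every listed property to be a matching string is an assumption about the intended semantics.

```typescript
type ArgPatterns = Record<string, RegExp>;

// Hypothetical matcher: every listed property must exist, be a string,
// and match its regex. Patterns see the raw value, not JSON text.
function matchesArgs(args: Record<string, unknown>, patterns: ArgPatterns): boolean {
  return Object.entries(patterns).every(([key, pattern]) => {
    const value = args[key];
    return typeof value === 'string' && pattern.test(value);
  });
}

// "^" and "$" now behave the way a policy writer would expect.
const gitLogOnly: ArgPatterns = { command: /^git log(?:\s|$)/ };

const matchesLog = matchesArgs({ command: 'git log -1' }, gitLogOnly); // true
const matchesLogout = matchesArgs({ command: 'git logout' }, gitLogOnly); // false
const matchesQuoted = matchesArgs({ command: 'echo "git log"' }, gitLogOnly); // false
const matchesMissing = matchesArgs({}, gitLogOnly); // false
```

Because each regex runs against the property value itself, there is no JSON escaping to account for and no need to embed the `"command":"` key in the pattern.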

Reply (Contributor, Author):

Follow Up in #15034

      decision = this.applyNonInteractiveMode(rule.decision);
    }
  } else {
    decision = this.applyNonInteractiveMode(rule.decision);
Review comment (Contributor):

nit: this cascade of identical else clauses is a code smell. It feels like there should be a way to rephrase this as rule.decision being the initially set default decision for this pass through the loop, to avoid these else clauses.
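
A rough sketch of the restructuring this nit suggests: set the rule's decision as the default once, let special cases overwrite it, and apply the mode adjustment in one place. Everything here is illustrative. `decideForRule` is a hypothetical name, and the stand-in `applyNonInteractiveMode` assumes that non-interactive mode turns ask_user into deny, which this thread does not confirm.

```typescript
type Decision = 'allow' | 'ask_user' | 'deny';

interface Rule {
  decision: Decision;
}

// Stand-in for the engine's method; the ask_user -> deny mapping is an assumption.
function applyNonInteractiveMode(decision: Decision, nonInteractive: boolean): Decision {
  return nonInteractive && decision === 'ask_user' ? 'deny' : decision;
}

function decideForRule(
  rule: Rule,
  subCommandDecisions: Decision[] | undefined,
  nonInteractive: boolean,
): Decision {
  // Default for this pass: the rule's own decision. No else branches needed.
  let decision: Decision = rule.decision;

  // Only an ALLOW on a compound command can be tightened by its subcommands.
  if (rule.decision === 'allow' && subCommandDecisions !== undefined) {
    if (subCommandDecisions.includes('deny')) {
      decision = 'deny';
    } else if (subCommandDecisions.includes('ask_user')) {
      decision = 'ask_user';
    }
  }

  // Mode adjustment happens exactly once, on whichever decision survived.
  return applyNonInteractiveMode(decision, nonInteractive);
}

const tightened = decideForRule({ decision: 'allow' }, ['allow', 'ask_user'], false); // 'ask_user'
const headless = decideForRule({ decision: 'allow' }, ['allow', 'ask_user'], true); // 'deny'
```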

Reply (Contributor, Author):

Follow Up in #15034

const command = (toolCall.args as { command?: string })?.command;
if (command) {
  await initializeShellParsers();
  const subCommands = splitCommands(command);
Review comment (Contributor):

I think the splitting into subcommands for shell tool calls should happen outside of the main "for each matching rule" loop:

  • As written, the nested check calls are unnecessarily repeated for each matching rule (if there's no DENY short-circuiting).
  • It would give the simpler (and I think sufficient?) semantics to shell tool calls that "each subcommand is checked independently."
  • I'm not 100% certain this works correctly as written if the initial subcommand has no matching rules. For example, consider git commit -m "..." && git push, where there's a DENY policy rule for git push and no policy rules for git commit. If this code is never reached, the DENY for the subcommand is not triggered. FWIW I believe the way default behaviors are implemented as "catch all" rules means this doesn't happen, but that's non-local behavior so somewhat fragile (and definitely worthy of a test case!).
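
The "split first, then check each subcommand independently" semantics could be sketched like this. It is a simplified model, not the engine: the splitter handles only `&&` and ignores quoting, `PrefixRule`, `checkOne`, and `checkCompound` are hypothetical names, and falling back to ask_user when no rule matches is an assumption.

```typescript
type Decision = 'allow' | 'ask_user' | 'deny';

interface PrefixRule {
  prefix: string;
  decision: Decision;
}

// Toy splitter for this sketch: handles only "&&" and ignores quoting.
function splitCommands(command: string): string[] {
  return command
    .split('&&')
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// First matching rule wins; no match falls back to the default decision.
function checkOne(command: string, rules: PrefixRule[], fallback: Decision): Decision {
  const rule = rules.find(
    (r) => command === r.prefix || command.startsWith(`${r.prefix} `),
  );
  return rule ? rule.decision : fallback;
}

const severity: Record<Decision, number> = { allow: 0, ask_user: 1, deny: 2 };

// Split once, check every subcommand on its own, keep the most restrictive result.
function checkCompound(
  command: string,
  rules: PrefixRule[],
  fallback: Decision = 'ask_user',
): Decision {
  const parts = splitCommands(command);
  if (parts.length === 0) return fallback; // nothing to check
  return parts
    .map((part) => checkOne(part, rules, fallback))
    .reduce((worst, next) => (severity[next] > severity[worst] ? next : worst));
}

// The scenario from the comment: a DENY rule for git push, no rule for git commit.
const rules: PrefixRule[] = [{ prefix: 'git push', decision: 'deny' }];
const result = checkCompound('git commit -m "wip" && git push', rules); // 'deny'
```

Because the split happens before any rule lookup, the DENY for `git push` is reached even though the first subcommand matches no rule, without relying on a catch-all rule existing elsewhere.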

Reply (Contributor, Author):

Follow Up in #15034


3 participants