fix(core): commandPrefix word boundary and compound command safety#15006
fix(core): commandPrefix word boundary and compound command safety#15006abhipatel12 merged 7 commits intomainfrom
Conversation
This change addresses two security/correctness issues in shell policy: 1. strict word boundaries for commandPrefix (e.g., 'git log' no longer matches 'git logout') 2. recursive safety checks for compound commands (e.g., 'git log && rm -rf /' is now properly vetted)
Summary of ChangesHello @allenhutchison, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the security and correctness of shell command execution within the system. It introduces more precise matching for command prefixes in policy rules and implements a robust mechanism to evaluate all components of compound shell commands. These changes collectively prevent potential security vulnerabilities by ensuring that only explicitly approved and safe shell operations are permitted, even when commands are chained together. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Size Change: +2.21 kB (+0.01%) Total Size: 21.6 MB
ℹ️ View Unchanged
|
There was a problem hiding this comment.
Code Review
This pull request introduces important security enhancements for shell command execution by enforcing strict word boundaries for commandPrefix and adding recursive safety checks for compound commands. The refactoring of shell permission logic into a separate shell-permissions.ts file is a good improvement for code organization.
However, I've identified a critical security vulnerability in the new compound command checking logic. If a command is crafted to be un-parseable by the shell parser but still matches an ALLOW rule on a prefix, it can bypass the safety checks. My review includes a specific comment and code suggestion to address this vulnerability by handling parser failures more safely.
This addresses PR feedback to ensure safe handling of un-parseable shell commands that might match an ALLOW prefix rule.
abhipatel12
left a comment
There was a problem hiding this comment.
Looks great overall, just a nit and a question about if we need a test file!
…cluding allowlisting, blocklisting, and injection prevention.
| toolName: 'run_shell_command', | ||
| // Mimic the regex generated by toml-loader for commandPrefix = ["git log"] | ||
| // Regex: "command":"git log(?:[\s"]|$) | ||
| argsPattern: /"command":"git log(?:[\s"]|$)/, |
There was a problem hiding this comment.
I'm concerned this will get out of sync with toml-loader. It may take some refactoring, but can we either:
- (Highest fidelity) Call the TOML loader here on a TOML string literal to generate the rules list.
- (Lower fidelity, maybe easier) Export the functionality that builds
argsPatternfromcommandPrefixand call that here.
| expect(result.decision).toBe(PolicyDecision.ASK_USER); | ||
| }); | ||
|
|
||
| it('SHOULD NOT allow "git log && rm -rf /" completely when prefix is "git log" (compound command safety)', async () => { |
There was a problem hiding this comment.
Also:
- Other combinators like
;and|| - Command substitution like
$(rm -rf /)and process substitution like>(rm -rf /)and<(rm -rf /). - Powershell tests?
I think at least some of these are already handled correctly, but we should be very, pedantically thorough in verifying this functionality.
| commandPrefixes.length > 0 | ||
| ? commandPrefixes.map( | ||
| (prefix) => `"command":"${escapeRegex(prefix)}`, | ||
| (prefix) => `"command":"${escapeRegex(prefix)}(?:[\\s"]|$)`, |
There was a problem hiding this comment.
This repetitive diff highlights that there's some code duplication here between the PolicyRule and SafetyCheckerRule processing that should be eliminated.
| commandPrefixes.length > 0 | ||
| ? commandPrefixes.map( | ||
| (prefix) => `"command":"${escapeRegex(prefix)}`, | ||
| (prefix) => `"command":"${escapeRegex(prefix)}(?:[\\s"]|$)`, |
There was a problem hiding this comment.
(?:[\\s"]|$) can be simplified to just [\\s"]. Considering the JSON encoding of the tool call parameters, $ should never match.
Taking a step back, though, I think it would be better to rework argsPattern entirely. Writing regexes against the JSON encoded args object is tricky and error-prone for both policy writers and Gemini CLI developers. For example, writing a regex for commandRegex has a number of oddities that could trip up a policy writer:
^and$do not match the start and end of the command string; in fact, they unconditionally fail to match.- The match is required to start at the beginning of the command string, but can end anywhere.
- The regex is actually matched against a JSON-escaped string of the command. In particular,
"is replaced with\", which could be a surprise.
Instead, a Record<string, RegExp> mapping top-level string properties (notably "command") to regexes would be much easier to use, and partially simplify the regex construction you have to do here. A more robust (but painful for policy writers) alternative or additional feature would be an argsSchema property that lets you provide a JSON schema for the args object.
| decision = this.applyNonInteractiveMode(rule.decision); | ||
| } | ||
| } else { | ||
| decision = this.applyNonInteractiveMode(rule.decision); |
There was a problem hiding this comment.
nit: this cascade of identical else clauses is a code smell. It feels like there should be a way to rephrase this as rule.decision being the initially set default decision for this pass through the loop, to avoid these else clauses.
| const command = (toolCall.args as { command?: string })?.command; | ||
| if (command) { | ||
| await initializeShellParsers(); | ||
| const subCommands = splitCommands(command); |
There was a problem hiding this comment.
I think the splitting into subcommands for shell tool calls should happen outside of the main "for each matching rule" loop:
- As written, the nested
checkcalls are unnecessarily repeated for each matching rule (if there's no DENY short-circuiting). - It would give the simpler (and I think sufficient?) semantics to shell tool calls that "each subcommand is checked independently."
- I'm not 100% certain this works correctly as written if the initial subcommand has no matching rules. For example, consider
git commit -m "..." && git push, where there's a DENY policy rule forgit pushand no policy rules forgit commit. If this code is never reached, the DENY for the subcommand is not triggered. FWIW I believe the way default behaviors are implemented as "catch all" rules means this doesn't happen, but that's non-local behavior so somewhat fragile (and definitely worthy of a test case!).
…compound command safety (google-gemini#15006)
This change addresses two security/correctness issues in shell policy:
Summary
Details
Related Issues
How to Validate
Pre-Merge Checklist