Enhance PolicyEngine with shell command parsing for granular policies #7838

@allenhutchison

Summary

Enhance the PolicyEngine to leverage existing shell parsing utilities for more granular and sophisticated policy rules on shell commands. This will enable policies based on specific commands within compound statements, operators used, and command patterns.

Context

The codebase already has robust shell parsing capabilities:

  • shell-utils.ts provides splitCommands(), getCommandRoot(), and pattern detection
  • shell-quote library is installed; its parse() turns shell syntax into a flat array of string tokens and operator objects (e.g. { op: '&&' })
  • The shell tool currently extracts root commands for allowlisting

The PolicyEngine should leverage these capabilities to enable more sophisticated policies.

Proposed Implementation

1. Add Shell Command Analysis to PolicyEngine

Create a shell analysis utility that extracts:

  • Individual commands from compound statements
  • Operators used (&&, ||, |, ;, &)
  • Dangerous patterns (pipes, background jobs, command substitution)
  • Command arguments and flags
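
The last bullet (arguments and flags) is not covered by the existing utilities mentioned above, so here is a minimal sketch of what that extraction could look like. `splitFlagsAndArgs` and `CommandParts` are hypothetical names, the input is assumed to be the already-tokenized words of a single command, and expanding clustered short flags is a heuristic that will mis-split flags with attached values such as `-n5`.

```typescript
// Hypothetical helper: not part of shell-utils.ts or shell-quote.
interface CommandParts {
  root: string; // first token, e.g. "rm"
  flags: string[]; // option tokens, with short clusters expanded
  args: string[]; // remaining positional tokens
}

function splitFlagsAndArgs(tokens: string[]): CommandParts {
  const [root = '', ...rest] = tokens;
  const flags: string[] = [];
  const args: string[] = [];
  let endOfOptions = false;

  for (const token of rest) {
    if (endOfOptions) {
      args.push(token);
    } else if (token === '--') {
      // Conventional end-of-options marker: everything after is positional.
      endOfOptions = true;
    } else if (/^--/.test(token)) {
      flags.push(token);
    } else if (/^-./.test(token)) {
      // Expand clustered short flags: "-rf" becomes "-r", "-f".
      for (const ch of token.slice(1)) {
        flags.push(`-${ch}`);
      }
    } else {
      args.push(token);
    }
  }
  return { root, flags, args };
}
```

Normalizing `-rf` and `-fr` to the same flag set would let a rule test for `-r` and `-f` independently instead of enumerating orderings in a regular expression.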

2. Extend PolicyRule for Shell-Specific Matching

interface PolicyRule {
  // Existing fields...
  
  // New shell-specific fields
  shellCommand?: string;        // Match specific shell commands (e.g., "rm", "git")
  shellPattern?: RegExp;        // Match command patterns (e.g., /^rm.*-rf/)
  allowPipes?: boolean;         // Policy on piped commands
  allowBackground?: boolean;    // Policy on background jobs
  allowConditionals?: boolean;  // Policy on && and || operators
  maxChainLength?: number;      // Limit command chain length
}

3. Example Policy Configurations

// Deny any rm command with -rf flags
{
  shellCommand: 'rm',
  shellPattern: /-rf|-fr/,
  decision: PolicyDecision.DENY
}

// Allow git commands but not in pipes
{
  shellCommand: 'git',
  allowPipes: false,
  decision: PolicyDecision.ALLOW
}

// Require confirmation for compound commands
{
  allowConditionals: true,
  maxChainLength: 2,
  decision: PolicyDecision.ASK_USER
}
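
The examples above leave the exact matching semantics open, so the following is only one possible reading, sketched for discussion. It assumes that `allowPipes: false` (and the other `allow*` fields) means "this rule does not apply when that operator is present", and that `maxChainLength` caps the number of root commands a rule will match. `ShellRule`, `ShellAnalysis`, `ruleMatches`, and the `PolicyDecision` stand-in are hypothetical and only mirror the fields shown in this issue.

```typescript
// Stand-in for the real PolicyDecision, which this issue references but does not define.
const PolicyDecision = { ALLOW: 'allow', DENY: 'deny', ASK_USER: 'ask_user' } as const;
type PolicyDecision = (typeof PolicyDecision)[keyof typeof PolicyDecision];

// Only the shell-specific fields proposed in this issue.
interface ShellRule {
  shellCommand?: string;
  shellPattern?: RegExp;
  allowPipes?: boolean;
  allowBackground?: boolean;
  allowConditionals?: boolean;
  maxChainLength?: number;
  decision: PolicyDecision;
}

// Hypothetical subset of the analysis result that matching needs.
interface ShellAnalysis {
  command: string; // full command text
  rootCommands: string[];
  hasPipes: boolean;
  hasConditionals: boolean;
  hasBackground: boolean;
}

// Returns true when every constraint the rule sets is satisfied.
function ruleMatches(rule: ShellRule, a: ShellAnalysis): boolean {
  if (rule.shellCommand !== undefined && a.rootCommands.indexOf(rule.shellCommand) === -1) {
    return false;
  }
  if (rule.shellPattern !== undefined && !rule.shellPattern.test(a.command)) return false;
  if (rule.allowPipes === false && a.hasPipes) return false;
  if (rule.allowBackground === false && a.hasBackground) return false;
  if (rule.allowConditionals === false && a.hasConditionals) return false;
  if (rule.maxChainLength !== undefined && a.rootCommands.length > rule.maxChainLength) {
    return false;
  }
  return true;
}
```

Under this reading the `git` example matches `git status` but not `git log | head`, so the piped form would fall through to whatever rule comes next.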

4. Integration with Tool Confirmation Message Bus

When a shell command is submitted:

  1. Parse the command using existing utilities
  2. Extract all root commands and operators
  3. Check each command against PolicyRules
  4. Apply the highest-priority matching rule
  5. Consider operator-based rules (pipes, conditionals)
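
Steps 3 and 4 could be sketched roughly as follows. This assumes rules carry a numeric `priority` where higher wins, and that a compound command takes the strictest decision among its sub-commands (DENY over ASK_USER over ALLOW); neither is specified in this issue. `CommandRule`, `decideForRoot`, `decideForChain`, and the `PolicyDecision` stand-in are hypothetical names.

```typescript
// Stand-in for the real PolicyDecision, which this issue references but does not define.
const PolicyDecision = { ALLOW: 'allow', DENY: 'deny', ASK_USER: 'ask_user' } as const;
type PolicyDecision = (typeof PolicyDecision)[keyof typeof PolicyDecision];

interface CommandRule {
  shellCommand?: string; // omitted: applies to any root command
  priority: number; // assumed: higher value wins
  decision: PolicyDecision;
}

// Assumed ordering used to combine decisions across a chain.
const SEVERITY: Record<PolicyDecision, number> = { allow: 0, ask_user: 1, deny: 2 };

// Step 4: pick the highest-priority rule that applies to one root command.
function decideForRoot(
  root: string,
  rules: CommandRule[],
  fallback: PolicyDecision,
): PolicyDecision {
  let best: CommandRule | undefined;
  for (const rule of rules) {
    const applies = rule.shellCommand === undefined || rule.shellCommand === root;
    if (applies && (best === undefined || rule.priority > best.priority)) {
      best = rule;
    }
  }
  return best ? best.decision : fallback;
}

// Step 3: evaluate every root command, then keep the strictest outcome.
function decideForChain(
  roots: string[],
  rules: CommandRule[],
  fallback: PolicyDecision = PolicyDecision.ASK_USER,
): PolicyDecision {
  if (roots.length === 0) return fallback;
  let worst: PolicyDecision = PolicyDecision.ALLOW;
  for (const root of roots) {
    const decision = decideForRoot(root, rules, fallback);
    if (SEVERITY[decision] > SEVERITY[worst]) worst = decision;
  }
  return worst;
}
```

Taking the strictest result means an allowlisted command cannot be used to smuggle a denied one through `&&` or `;`. Operator-based rules (step 5) would be layered on top of this result.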

Implementation Details

Utility Function

import { parse } from 'shell-quote';
import { getCommandRoots, detectCommandSubstitution } from '../utils/shell-utils.js';

export function analyzeShellCommand(command: string) {
  const roots = getCommandRoots(command);
  const parsed = parse(command);

  // Keep only control operators. parse() also emits { op: 'glob', pattern }
  // entries for glob tokens such as *.ts, and { comment } entries; neither
  // is an operator. The explicit narrowing also lets item.op type-check.
  const operators = parsed.flatMap((item) =>
    typeof item === 'object' && 'op' in item && item.op !== 'glob'
      ? [item.op]
      : [],
  );

  return {
    rootCommands: roots,
    operators,
    hasPipes: operators.includes('|'),
    hasConditionals: operators.includes('&&') || operators.includes('||'),
    hasBackground: operators.includes('&'),
    hasSubstitution: detectCommandSubstitution(command),
    chainLength: roots.length,
    parsed,
  };
}

PolicyEngine Enhancement

The PolicyEngine.check() method would:

  1. Detect if the tool call is a shell command
  2. Analyze the command structure
  3. Apply shell-specific rules in addition to general rules
  4. Return the appropriate decision based on the analysis
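
A rough sketch of that flow, with the general and shell-specific evaluations passed in as functions so the example stays self-contained. The tool name `run_shell_command`, the `ToolCall` shape, and the rule that a shell-specific decision can only tighten the general one are assumptions for illustration, not the existing PolicyEngine API.

```typescript
// Stand-in for the real PolicyDecision, which this issue references but does not define.
const PolicyDecision = { ALLOW: 'allow', DENY: 'deny', ASK_USER: 'ask_user' } as const;
type PolicyDecision = (typeof PolicyDecision)[keyof typeof PolicyDecision];

// Hypothetical shapes for illustration only.
interface ToolCall {
  name: string;
  args: { command?: string };
}
type GeneralCheck = (call: ToolCall) => PolicyDecision;
// Returns undefined when no shell-specific rule matches the command.
type ShellCheck = (command: string) => PolicyDecision | undefined;

const SHELL_TOOL_NAME = 'run_shell_command'; // assumed tool name
const SEVERITY: Record<PolicyDecision, number> = { allow: 0, ask_user: 1, deny: 2 };

function check(call: ToolCall, generalCheck: GeneralCheck, shellCheck: ShellCheck): PolicyDecision {
  const general = generalCheck(call);

  // Step 1: only shell tool calls that carry a command get shell analysis.
  const command = call.args.command;
  if (call.name !== SHELL_TOOL_NAME || typeof command !== 'string') {
    return general;
  }

  // Steps 2-3: analyze the command and apply shell-specific rules.
  const shell = shellCheck(command);
  if (shell === undefined) return general;

  // Step 4: assumed combination rule, the stricter decision wins.
  return SEVERITY[shell] > SEVERITY[general] ? shell : general;
}
```

Keeping the shell path additive means non-shell tool calls behave exactly as they do today.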

Benefits

  • Granular Control: Policies can target specific dangerous commands or patterns
  • Security: Better detection and prevention of risky command combinations
  • Flexibility: Different rules for different command types and operators
  • User Experience: More intelligent confirmation prompts based on actual risk

Dependencies

Acceptance Criteria

  • PolicyEngine can analyze shell commands using existing parsing utilities
  • PolicyRules support shell-specific matching criteria
  • Shell commands are evaluated against both general and shell-specific rules
  • Compound commands are properly analyzed for each component
  • Dangerous patterns trigger appropriate policy decisions
  • Unit tests cover various shell command scenarios
  • Performance remains acceptable for complex commands

Related Issues

/cc @allenhutchison

Metadata

Labels

area/agent: Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality
priority/p2: Important but can be addressed in a future release.

Status

Closed
