Skip to content

OWASP Agentic Top 10 compliance evaluation: gap analysis across gh-aw, firewall, and mcpg #28770

@lpcox

Description

@lpcox

Summary

Evaluate gh-aw's compliance against the OWASP Agentic Top 10 (2026) using the Microsoft Agent Governance Toolkit's compliance mapping as the reference framework. This issue documents current coverage, identifies gaps, and proposes changes across the three gh-aw components: gh-aw (compiler/CLI), gh-aw-firewall (AWF sandbox), and gh-aw-mcpg (MCP gateway).

Overall Assessment

gh-aw provides strong defense-in-depth coverage across all 10 OWASP categories. Coverage is strongest in supply chain (SHA pinning), identity/privilege management (role checks + permissions), and rogue agent detection (audit + threat detection). Gaps are concentrated in runtime behavioral monitoring, inter-agent encryption, and cascading failure resilience.


Coverage by OWASP Category

ASI-01: Agent Goal Hijack ✅ Strong Coverage

Attackers manipulate the agent's objectives via indirect prompt injection or poisoned inputs.

Current controls:

  • Expression allowlist (expression_safety_validation.go) — hard-coded list of ~20 approved expression prefixes; all others rejected at compile time
  • Template injection detection (template_injection_validation.go) — prevents ${{ }} expressions in shell commands; forces env: binding
  • Markdown security scanner (markdown_security_scanner.go) — detects 6 attack categories: unicode abuse, hidden content, obfuscated links, HTML injection, embedded files, social engineering
  • Input sanitization (sanitize_incoming_text*.go) — @mention neutralization, URL redaction, domain filtering; fuzz-tested
  • Safe input/output boundary — all trigger content (issue title/body, PR body, comments) is sanitized before reaching the agent prompt

Gaps:

Gap Severity Component Description
No runtime prompt injection detection Medium gh-aw-mcpg The MCP gateway does not inspect tool responses for indirect prompt injection (e.g., a web page fetched by the agent containing "ignore previous instructions"). The Microsoft framework uses a policy engine that intercepts every action. gh-aw's defenses are compile-time (expression safety) and input-boundary (sanitization), but tool responses flowing back to the agent are not inspected.
No policy engine for runtime action interception Low gh-aw gh-aw does not have a general-purpose runtime policy engine that evaluates each agent action against declarative rules. The closest equivalent is the threat detection job, but it runs after the agent completes, not inline. This is acceptable for the current architecture (single-agent, sandboxed) but would be a gap for multi-agent orchestration.

ASI-02: Tool Misuse & Exploitation ✅ Strong Coverage

An agent's authorized tools are abused in unintended ways.

Current controls:

  • Tool allowlisting (tools_validation.go) — all tools explicitly declared in frontmatter; anonymous tools rejected
  • Bash command restrictionsbash: true|false|["cmd1","cmd2"] granular control
  • AWF network firewall — domain allowlist/denylist with SSL Bump HTTPS inspection
  • MCP gateway tool filtering — only configured MCP tools are exposed to the agent
  • Network egress control (network_firewall_validation.go) — compile-time validation of allowed domains

Gaps:

Gap Severity Component Description
No tool input/output sanitization at gateway Medium gh-aw-mcpg The MCP gateway proxies tool calls but does not inspect tool inputs for injection patterns (e.g., shell metacharacters in a bash tool argument crafted by the agent). The Microsoft framework includes input sanitization and command injection detection in the proxy layer. AWF's firewall handles network-level controls but not semantic tool-call inspection.
No tool call rate limiting Low gh-aw-mcpg Individual tool calls are not rate-limited at the gateway. An agent could make hundreds of search_code or list_issues calls in a loop. The Microsoft framework uses per-tool rate limits as a defense against data exfiltration via read operations.

ASI-03: Identity & Privilege Abuse ✅ Strong Coverage

Agents escalate privileges by abusing identities or inheriting excessive credentials.

Current controls:

  • Role-based access (role_checks.go) — mandatory team membership verification before agent execution
  • Permission tiers (permissions.go) — 30+ GitHub Actions scopes with explicit read/write separation
  • GitHub App scoping (github_app_permissions_validation.go) — App-only scopes enforced
  • Token precedence (github_token.go) — layered token resolution with explicit precedence
  • Write-via-safe-outputs — write operations go through safe-outputs, not direct token access
  • Secret validation (validate_multi_secret.sh) — verifies token format (fine-grained PAT prefix) in activation

Gaps:

Gap Severity Component Description
No live token validity check Medium gh-aw Secret validation checks token existence and format (prefix) but does not make a live API call to verify the token is not expired, revoked, or missing required scopes. Expired tokens are only discovered deep into the agent job. A lightweight preflight API call in the activation job would fast-fail.
No agent-level identity (DID) Low gh-aw Agents do not have cryptographic identities. Each workflow run uses the GITHUB_TOKEN or a PAT, but there is no per-agent identity that could be used for trust scoring or delegation chains. This is a gap relative to the Microsoft framework's DID model but is low priority given gh-aw's single-agent-per-run architecture.

ASI-04: Agentic Supply Chain Vulnerabilities ✅ Very Strong Coverage

Vulnerabilities in third-party tools, plugins, or dependencies.

Current controls:

  • SHA pinning (pkg/actionpins/) — all GitHub Actions pinned to specific commit SHAs, not version tags
  • Container image pinning — Docker images use digest-based references
  • Lock file integrity (lock_validation.go) — compiled .lock.yml files are tracked in git
  • Cache integrity (cache_integrity.go) — guard policy hashes isolate cache across configurations
  • Action pin registry — embedded pin table; no runtime resolution

Gaps:

Gap Severity Component Description
No AI-BOM / model provenance tracking Low gh-aw gh-aw does not generate or track an AI Bill of Materials (model version, training data provenance, weights hashing). The Microsoft framework includes comprehensive AI-BOM with SLSA build provenance. For gh-aw, tracking the engine version + model name in aw_info.json partially covers this, but a formal AI-BOM is not generated.
No MCP server provenance verification Medium gh-aw-mcpg Third-party MCP servers configured via tools: are fetched at runtime without cryptographic verification. There is no equivalent of SRI hashes or signature verification for MCP server binaries/containers. A malicious or compromised MCP server could be injected via registry poisoning.

ASI-05: Unexpected Code Execution ✅ Strong Coverage

Agents trigger remote code execution through tools, interpreters, or APIs.

Current controls:

  • Mandatory sandbox — containerized execution (SRT) by default via AWF
  • Mount validation (sandbox_validation.go) — strict source:destination:ro|rw format
  • Firewall gateway — AWF provides network isolation around agent execution
  • Expression safety — no dynamic code evaluation in compiled workflows
  • Step order validation (step_order_validation.go) — ensures secret redaction before artifact uploads

Gaps:

Gap Severity Component Description
No per-tool-call resource limits (CPU/memory/time) Medium gh-aw-firewall AWF enforces network egress control but does not impose per-tool-call CPU, memory, or time limits. The stop-after feature limits total workflow duration, but an individual tool call (e.g., a bash command running a fork bomb) could exhaust runner resources before the time limit triggers. The Microsoft framework uses Ring-based execution with per-invocation resource quotas.
No execution ring isolation Low gh-aw-firewall All tools run at the same privilege level within the container. There is no tiered execution model (e.g., Ring 0-3) where untrusted operations run with fewer capabilities than trusted ones.

ASI-06: Memory & Context Poisoning ⚠️ Moderate Coverage — Largest Gap Area

Persistent memory or long-running context is poisoned with malicious instructions.

Current controls:

  • Repo memory isolation (repo_memory.go) — per-memory tokens, file glob patterns, size limits (10KB/file, 100KB patch, 100 files)
  • Extension whitelist — only .json, .jsonl, .txt, .md, .csv allowed in repo memory
  • Cache memory validation — restore keys isolated per guard policy
  • Threat detection — separate isolated job scans outputs before safe-outputs deployment
  • Safe output size limits — max body length, max file count

Gaps:

Gap Severity Component Description
Repo memory content not scanned for prompt injection on read High gh-aw When a workflow reads from repo memory (persisted from a previous run), the content is not scanned for prompt injection before being injected into the agent's context. A previous compromised run could write poisoned instructions to repo memory that would hijack the next run. The Microsoft framework uses CMVK (Cross-Model Verification Kernel) to verify claims across multiple models.
Cache memory content not scanned on read Medium gh-aw Cache memory (/tmp/gh-aw/cache-memory/) persists across runs and is read back into agent context without sanitization. Similar to repo memory, this is a vector for cross-run context poisoning.
No immutable memory audit trail Low gh-aw Repo memory changes are committed to git (providing history), but cache memory changes are not audited. The Microsoft framework uses hash-chain verification for all memory mutations.

ASI-07: Insecure Inter-Agent Communication ⚠️ Moderate Coverage

Agents collaborate without adequate authentication, confidentiality, or validation.

Current controls:

  • MCP gateway authentication — API key authentication between agent and gateway
  • Gateway containerization — MCP gateway runs in an isolated Docker container
  • DIFC proxy (compiler_difc_proxy.go) — integrity-filtered gh CLI routing with TLS enforcement
  • Version pinning — gateway image version pinned (not :latest)

Gaps:

Gap Severity Component Description
No E2E encryption for agent↔gateway communication Medium gh-aw-mcpg Communication between the agent process and the MCP gateway uses HTTP over localhost (not HTTPS). While this is within the same container/runner, a process on the runner could sniff the traffic. The Microsoft framework uses Signal protocol for E2E encryption. For gh-aw, mTLS between agent and gateway would be a proportionate improvement.
No message signing/verification Medium gh-aw-mcpg MCP tool call requests and responses are not cryptographically signed. A man-in-the-middle on the runner could tamper with tool responses. The Microsoft framework uses IATP with Ed25519 signatures.
No trust scoring for MCP servers Low gh-aw-mcpg All configured MCP servers are treated as equally trusted. There is no reputation or trust scoring system that could downgrade a server exhibiting anomalous behavior.

ASI-08: Cascading Failures ⚠️ Moderate Coverage

An initial error triggers multi-step compound failures across chained agents.

Current controls:

  • Stop-after time limits (stop_after.go) — relative time bounds (+1d, +8h)
  • Concurrency control (concurrency.go) — cancel-in-progress prevents duplicate runs
  • Threat detection isolation — separate job prevents compromised outputs from deploying
  • Max turns validation — agent turn limits prevent infinite loops

Gaps:

Gap Severity Component Description
No circuit breaker for repeated failures High gh-aw If a workflow fails repeatedly (e.g., 10 consecutive failures), it continues to be triggered on every matching event, consuming compute and tokens. There is no circuit breaker that auto-disables a workflow after N consecutive failures. The Microsoft framework implements circuit breakers that isolate failing agents automatically. This could be implemented as a pre-activation check that queries recent run history.
No SLO/error budget tracking Medium gh-aw There is no built-in concept of error budgets or SLOs per workflow. The token audit workflow provides observability, but there is no automated intervention when a workflow exceeds its failure budget.
No cascading failure detection for orchestrator workflows Medium gh-aw Orchestrator workflows (compiler_orchestrator_workflow.go) dispatch child workflows but do not monitor for cascading failure patterns (e.g., child A fails → child B retries → child C overwhelmed).

ASI-09: Human-Agent Trust Exploitation ✅ Strong Coverage

Attackers leverage misplaced user trust in agents to authorize dangerous actions.

Current controls:

  • Manual approval (manual_approval.go) — GitHub environment protection rules integration
  • Role verification — team membership checked before execution
  • Reaction-based confirmation — approve via emoji reactions on issues/PRs
  • Safe-outputs boundary — all GitHub write operations go through safe-outputs with human-auditable output
  • Status comments — visible progress updates for human oversight

Gaps:

Gap Severity Component Description
No risk-based approval escalation Low gh-aw All manual approvals use the same mechanism regardless of action risk level. The Microsoft framework classifies actions as critical/high/medium/low and requires quorum approval for critical actions. gh-aw could add a risk-level frontmatter field that gates high-risk operations behind stricter approval requirements.
No approval expiration tracking Low gh-aw GitHub environment approvals do not have explicit expiration within gh-aw's control. The Microsoft framework tracks approval expiration to prevent stale authorizations.

ASI-10: Rogue Agents ✅ Very Strong Coverage

Agents operating outside their defined scope.

Current controls:

  • Comprehensive auditing (pkg/cli/audit*.go) — downloads artifacts/logs, parses MCP usage, firewall logs, generates detailed reports
  • Threat detection (threat_detection.go) — isolated job scans agent outputs for anomalies
  • Firewall audit trail — tracks all blocked domains/URLs
  • Concurrency cancellation — immediate kill via cancel-in-progress
  • Stop-after time limits — temporal kill switch
  • Multi-run diff (audit_diff.go) — compare runs to detect behavioral drift

Gaps:

Gap Severity Component Description
No real-time behavioral monitoring Medium gh-aw-firewall Threat detection runs after the agent completes, not during execution. A rogue agent could exfiltrate data or cause damage before the post-hoc scan catches it. Real-time monitoring of tool call patterns (frequency, targets, data volume) during execution would enable earlier intervention.
No Merkle audit trail Low gh-aw Audit logs are stored as artifacts (Markdown reports) but are not hash-chained. A compromised runner could tamper with audit artifacts. The Microsoft framework uses Merkle trees for cryptographic proof of action history.

Gap Summary by Component

gh-aw (compiler/CLI) — 7 gaps

Priority Gap OWASP
🔴 High Repo memory content not scanned for prompt injection on read ASI-06
🔴 High No circuit breaker for repeatedly failing workflows ASI-08
🟡 Medium No live token validity check in activation preflight ASI-03
🟡 Medium Cache memory content not scanned on read ASI-06
🟡 Medium No SLO/error budget tracking per workflow ASI-08
🟡 Medium No cascading failure detection for orchestrator workflows ASI-08
🟢 Low No AI-BOM generation, risk-based approval escalation, Merkle audit trail, agent-level DID ASI-04/09/10/03

gh-aw-firewall (AWF sandbox) — 3 gaps

Priority Gap OWASP
🟡 Medium No per-tool-call resource limits (CPU/memory/time) ASI-05
🟡 Medium No real-time behavioral monitoring during execution ASI-10
🟢 Low No execution ring isolation (all tools same privilege) ASI-05

gh-aw-mcpg (MCP gateway) — 6 gaps

Priority Gap OWASP
🟡 Medium No prompt injection detection in tool responses ASI-01
🟡 Medium No tool input sanitization at gateway ASI-02
🟡 Medium No MCP server provenance verification ASI-04
🟡 Medium No E2E encryption for agent↔gateway communication ASI-07
🟡 Medium No message signing/verification for tool calls ASI-07
🟢 Low No tool call rate limiting, trust scoring for MCP servers ASI-02/07

Recommendations

Phase 1: High-Priority (address first)

  1. Repo/cache memory sanitization on read (gh-aw) — scan persisted memory for prompt injection patterns before injecting into agent context
  2. Circuit breaker for failing workflows (gh-aw) — pre-activation check that queries recent run history and skips execution after N consecutive failures

Phase 2: Medium-Priority

  1. Live token validity preflight (gh-aw) — lightweight API call in activation to verify tokens are not expired/revoked
  2. Tool response inspection (gh-aw-mcpg) — scan MCP tool responses for prompt injection patterns before returning to agent
  3. Per-tool resource limits (gh-aw-firewall) — configurable CPU/memory/time limits per tool invocation
  4. Agent↔gateway mTLS (gh-aw-mcpg) — encrypted communication between agent and MCP gateway
  5. MCP server provenance (gh-aw-mcpg) — hash verification for MCP server images/binaries

Phase 3: Lower-Priority Hardening

  1. AI-BOM generation, Merkle audit trails, execution rings, trust scoring

References

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions