feat: policy recommendation plumbing — denial aggregation, transport, approval pipeline, and mechanistic recommendations

## Summary

Build the end-to-end infrastructure for sandbox-initiated policy recommendations: denial aggregation in the sandbox, gRPC transport to the gateway, persistence, approval workflow, policy merge, and CLI/TUI for human-in-the-loop review. Includes a deterministic (no-LLM) chunk generator so the full pipeline is testable without inference configured.

This is the plumbing half of #153. The LLM-powered PolicyAdvisor agent harness is a follow-up issue that plugs into this infrastructure.

## Architecture

Follows **Option D** from the [design doc](https://gitlab-master.nvidia.com/-/snippets/12930): the sandbox aggregates denials locally, generates mechanistic recommendations, and submits them to the gateway for persistence and approval.

```
SANDBOX                              GATEWAY                        USER
┌──────────────────────┐     ┌──────────────────────┐     ┌──────────────────┐
│                      │     │                      │     │                  │
│  proxy.rs deny event │     │                      │     │                  │
│       │              │     │                      │     │                  │
│       v              │     │                      │     │                  │
│  DenialAggregator    │     │                      │     │                  │
│    group by          │     │                      │     │                  │
│    (host,port,binary)│     │                      │     │                  │
│    dedup + cooldown  │     │                      │     │                  │
│    L7 event ingestion│     │                      │     │                  │
│       │              │     │                      │     │                  │
│       v              │     │                      │     │                  │
│  MechanisticMapper   │     │                      │     │                  │
│    host:port → rule  │     │                      │     │                  │
│    Stage 1: L7 audit │     │                      │     │                  │
│    Stage 2: refine   │     │                      │     │                  │
│       │              │     │                      │     │                  │
│       │ SubmitPolicy │     │                      │     │                  │
│       │ Analysis RPC │     │                      │     │                  │
│       └──────────────┼────>│  Validate + persist  │     │                  │
│                      │     │  DraftPolicyUpdate   │────>│  CLI: draft list │
│                      │     │       │              │     │  TUI: draft panel│
│                      │     │       v              │     │       │          │
│                      │     │  Approve/Reject RPCs │<────│  approve/reject  │
│                      │     │       │              │     │                  │
│                      │     │  Merge into active   │     │                  │
│  Policy poll ←───────┼─────│  policy              │     │                  │
│  OPA reload          │     │                      │     │                  │
└──────────────────────┘     └──────────────────────┘     └──────────────────┘
```

## Scope

### Proto definitions (all messages + RPCs)

New messages in `proto/`:
- `L7RequestSample` — observed HTTP method+path from L7 inspection
- `DenialSummary` — structured denial data from sandbox aggregator (host, port, binary, counts, L7 samples, cmdlines, denial_stage)
- `PolicyChunk` — proposed rule with rationale, status, stage, supersession
- `DraftPolicyUpdate` — new `SandboxStreamEvent` variant for live notifications
- Request/response pairs for all RPCs below

New RPCs on `Navigator` service:
- `SubmitPolicyAnalysis` — sandbox → gateway: atomic submission of denial summaries + proposed chunks
- `GetDraftPolicy` — CLI/TUI → gateway: query draft chunks with optional status filter
- `ApproveDraftChunk` / `RejectDraftChunk` — single-chunk approval/rejection
- `ApproveAllDraftChunks` — bulk approval (skips security-flagged unless forced)
- `EditDraftChunk` — modify a pending chunk in-place (e.g., narrow `allowed_ips`)
- `UndoDraftChunk` — reverse last approval
- `GetDraftHistory` — audit trail of all decisions

See design doc Section 10 for complete proto definitions.

### Gateway persistence layer

New tables:
- `draft_policy_chunks` — stores proposed chunks with status lifecycle (pending → approved/rejected/superseded)
- `denial_summaries` — stores aggregated denial data, upserted by `(sandbox_id, host, port, binary)`

Indexes for efficient querying by sandbox + status.

Extend the `Store` trait (`crates/navigator-server/src/persistence/mod.rs`) with methods for CRUD on both tables.
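To make the chunk status lifecycle concrete, here is a minimal sketch of the transitions described in this issue. The enum and method names are hypothetical, not the actual schema. It assumes a rejected or superseded chunk is terminal, which the issue does not state either way.

```rust
/// Status lifecycle for rows in `draft_policy_chunks` (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChunkStatus {
    Pending,
    Approved,
    Rejected,
    Superseded,
}

impl ChunkStatus {
    /// Returns true when moving from `self` to `next` matches a flow in this issue.
    pub fn can_transition_to(self, next: ChunkStatus) -> bool {
        use ChunkStatus::*;
        matches!(
            (self, next),
            // pending -> approved / rejected / superseded
            (Pending, Approved)
                | (Pending, Rejected)
                | (Pending, Superseded)
                // an approved Stage 1 chunk is superseded by its Stage 2 refinement
                | (Approved, Superseded)
                // UndoDraftChunk reverts an approved chunk to pending
                | (Approved, Pending)
        )
    }
}
```

A `Store` implementation could call a check like this before updating a row, so an invalid status change fails early instead of being persisted.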

### Gateway gRPC handlers

In `crates/navigator-server/src/grpc.rs`:
- `SubmitPolicyAnalysis` handler:
  - validate trust boundary (reject loopback/link-local hosts, rate limit per sandbox, format checks)
  - DNS resolution for `resolved_ips` + `is_private_ip` annotation
  - persist summaries + chunks
  - increment draft version
  - publish `DraftPolicyUpdate` to `SandboxWatchBus`
- `GetDraftPolicy` handler: query chunks by sandbox, optional status filter
- `ApproveDraftChunk` handler: load current active policy, merge chunk's `proposed_rule` into `network_policies` map, persist via existing `UpdateSandboxPolicy` internals (deterministic_hash, put_policy_revision, supersede_older, notify watch_bus), update chunk status → approved, update denial_summaries status → resolved
- `RejectDraftChunk` handler: update chunk status → rejected, store rejection reason
- `ApproveAllDraftChunks` handler: iterate pending chunks, skip those with `security_notes` unless `include_security_flagged=true`, merge all into active policy
- `EditDraftChunk` handler: replace `proposed_rule` on a pending chunk
- `UndoDraftChunk` handler: remove merged rule from active policy, revert chunk to pending
- `GetDraftHistory` handler: return chronological decision log
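The selection step of `ApproveAllDraftChunks` could look roughly like the sketch below. `DraftChunk` is a simplified stand-in for the real `PolicyChunk` message. Treating an empty `security_notes` string as "not flagged" is an assumption based on proto3 string defaults.

```rust
/// Simplified stand-in for a persisted chunk (hypothetical field layout).
#[derive(Debug, Clone)]
pub struct DraftChunk {
    pub id: String,
    pub pending: bool,
    /// Assumption: an empty string means the chunk carries no security flag.
    pub security_notes: String,
}

/// Returns the ids of chunks that a bulk approval would merge.
/// Security-flagged chunks are skipped unless `include_security_flagged` is set.
pub fn select_for_bulk_approval(
    chunks: &[DraftChunk],
    include_security_flagged: bool,
) -> Vec<&str> {
    chunks
        .iter()
        .filter(|c| c.pending)
        .filter(|c| include_security_flagged || c.security_notes.is_empty())
        .map(|c| c.id.as_str())
        .collect()
}
```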

### Gateway validation (trust boundary)

Per R7 from the design doc:
- Reject chunks with loopback (127.0.0.0/8) or link-local (169.254.0.0/16) hosts — use `is_internal_ip()` from `proxy.rs:956-974` as reference
- Rate limit: max 10 outstanding pending chunks per sandbox (`max_outstanding`)
- Format validation: rule names, endpoint fields, binary paths
- DNS re-verification: gateway resolves hostnames independently (sandbox DNS is untrusted), annotates `resolved_ips` and `is_private_ip` on denial summaries
- Pre-merge conflict detection: check if proposed rule overlaps existing `network_policies` entries (same host:port:binary)
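The address checks above could be sketched with the standard library as follows. This is not the existing `is_internal_ip()` from `proxy.rs`; the function and enum names are hypothetical. The IPv6 branches and the private-range definition (RFC 1918 for IPv4, `fc00::/7` for IPv6) are assumptions, since this issue only names the IPv4 ranges.

```rust
use std::net::IpAddr;

/// Why a resolved address was rejected at the trust boundary.
#[derive(Debug, PartialEq, Eq)]
pub enum Rejection {
    Loopback,
    LinkLocal,
}

/// Rejects loopback (127.0.0.0/8) and link-local (169.254.0.0/16) addresses.
/// The IPv6 equivalents (::1 and fe80::/10) are an assumption of this sketch.
pub fn check_address(ip: IpAddr) -> Result<(), Rejection> {
    match ip {
        IpAddr::V4(v4) => {
            if v4.is_loopback() {
                return Err(Rejection::Loopback);
            }
            if v4.is_link_local() {
                return Err(Rejection::LinkLocal);
            }
        }
        IpAddr::V6(v6) => {
            if v6.is_loopback() {
                return Err(Rejection::Loopback);
            }
            // fe80::/10, checked by hand to avoid depending on newer std APIs.
            if (v6.segments()[0] & 0xffc0) == 0xfe80 {
                return Err(Rejection::LinkLocal);
            }
        }
    }
    Ok(())
}

/// Candidate value for the `is_private_ip` annotation on a denial summary.
pub fn is_private_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => v4.is_private(),
        // fc00::/7 unique local addresses.
        IpAddr::V6(v6) => (v6.segments()[0] & 0xfe00) == 0xfc00,
    }
}
```

Because the sandbox's DNS is untrusted, these checks would run on the addresses the gateway resolves itself. The sketch does not handle IPv4-mapped IPv6 addresses.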

### Sandbox DenialAggregator

New module in `crates/navigator-sandbox/src/`:
- Groups denial events by primary key `(host, port, binary)`
- Dedup window (default 60s) with threshold (default 3) before emission
- Cooldown (default 5m) between emissions for same key
- Count tracking: `window_count` (resets), `suppressed_count` (cooldown drops), `total_count` (cumulative, never resets)
- `persistent_threshold` (default 10): emit regardless of windowing for slow-drip patterns
- Memory bounds: `max_keys=1000`, overflow counter for flood protection
- Stale-flush: a periodic sweep (every 30s) emits entries older than 5m that have recorded any activity
- L7 event ingestion: collect `(method, path, decision)` from `L7_REQUEST` tracing events (relay.rs:123-133) into `l7_request_samples` map, capped at 50 distinct pairs per entry
- Cmdline sanitization: redact Authorization, X-Vault-Token, Cookie, X-Api-Key headers, passwords, query string tokens before storage
- Two event sources: L4 CONNECT deny from proxy.rs, L7 audit/deny from relay.rs

Design doc Section 9d has the full state machine diagram.
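As a rough illustration of the windowing, threshold, and cooldown behavior, here is a minimal sketch with the defaults listed above. It is not the state machine from the design doc. Time is passed in as plain seconds so the logic stays deterministic. The exact points where `window_count` resets are an assumption, and `persistent_threshold`, stale-flush, and L7 sample ingestion are left out.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct DenialKey {
    pub host: String,
    pub port: u16,
    pub binary: String,
}

#[derive(Debug, Default)]
struct Entry {
    window_start: u64,
    window_count: u32,
    suppressed_count: u32,
    total_count: u64,
    last_emit: Option<u64>,
}

pub struct DenialAggregator {
    window_secs: u64,
    threshold: u32,
    cooldown_secs: u64,
    max_keys: usize,
    entries: HashMap<DenialKey, Entry>,
    pub overflow_count: u64,
}

impl DenialAggregator {
    /// Defaults from this issue: 60s window, threshold 3, 5m cooldown, 1000 keys.
    pub fn with_defaults() -> Self {
        Self {
            window_secs: 60,
            threshold: 3,
            cooldown_secs: 300,
            max_keys: 1000,
            entries: HashMap::new(),
            overflow_count: 0,
        }
    }

    /// Records one denial at time `now` (seconds). Returns true when a summary
    /// for this key should be emitted.
    pub fn record(&mut self, key: DenialKey, now: u64) -> bool {
        // Flood protection: refuse new keys once the map is full.
        if !self.entries.contains_key(&key) && self.entries.len() >= self.max_keys {
            self.overflow_count += 1;
            return false;
        }
        let e = self.entries.entry(key).or_insert_with(|| Entry {
            window_start: now,
            ..Entry::default()
        });
        e.total_count += 1;

        // Within the cooldown after an emission, count the event but drop it.
        if let Some(last) = e.last_emit {
            if now.saturating_sub(last) < self.cooldown_secs {
                e.suppressed_count += 1;
                return false;
            }
        }
        // Start a fresh window when the previous one has expired.
        if now.saturating_sub(e.window_start) >= self.window_secs {
            e.window_start = now;
            e.window_count = 0;
        }
        e.window_count += 1;
        if e.window_count >= self.threshold {
            e.last_emit = Some(now);
            e.window_start = now;
            e.window_count = 0;
            return true;
        }
        false
    }

    /// Returns (window_count, suppressed_count, total_count) for a key.
    pub fn counts(&self, key: &DenialKey) -> Option<(u32, u32, u64)> {
        self.entries
            .get(key)
            .map(|e| (e.window_count, e.suppressed_count, e.total_count))
    }
}
```

With these defaults, three denials for the same key inside 60 seconds trigger one emission. Further denials in the next five minutes only increment `suppressed_count` and `total_count`.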

### Sandbox mechanistic chunk generator

A deterministic mapper (no LLM) that converts denial summaries into proposed `PolicyChunk`s:

**Stage 1** (L4 denial → initial recommendation):
- For HTTP-capable ports (80, 443, 8080, 8200, 9200): recommend rule with `protocol: rest`, `tls: terminate` (on port 443), `enforcement: audit`, `access: full`. This unblocks traffic while enabling L7 visibility
- For other ports: recommend plain L4 allow rule
- Rule name: `auto_{host}_{port}` (sanitized)
- Rationale: `"Denied {count} connections to {host}:{port} from {binary}"`
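A sketch of the Stage 1 mapping, under a few assumptions: `ProposedRule` and `Stage1Chunk` are simplified stand-ins for the real proto types, TLS termination is applied only on port 443, and "sanitized" is taken to mean replacing every non-alphanumeric character with `_`. The issue does not define the sanitization rule.

```rust
/// Simplified stand-in for the proposed network policy rule.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ProposedRule {
    /// L7-aware rule for HTTP-capable ports (`protocol: rest`).
    Rest {
        host: String,
        port: u16,
        tls_terminate: bool,
        enforcement: &'static str,
        access: &'static str,
    },
    /// Plain L4 allow rule for all other ports.
    L4Allow { host: String, port: u16 },
}

pub struct Stage1Chunk {
    pub rule_name: String,
    pub rule: ProposedRule,
    pub rationale: String,
}

const HTTP_CAPABLE_PORTS: [u16; 5] = [80, 443, 8080, 8200, 9200];

/// Assumed sanitization: lowercase ASCII alphanumerics are kept, the rest become '_'.
fn sanitize(host: &str) -> String {
    host.chars()
        .map(|c| if c.is_ascii_alphanumeric() { c.to_ascii_lowercase() } else { '_' })
        .collect()
}

/// Maps one aggregated L4 denial to an initial recommendation.
pub fn stage1(host: &str, port: u16, binary: &str, count: u64) -> Stage1Chunk {
    let rule = if HTTP_CAPABLE_PORTS.contains(&port) {
        ProposedRule::Rest {
            host: host.to_string(),
            port,
            tls_terminate: port == 443,
            enforcement: "audit",
            access: "full",
        }
    } else {
        ProposedRule::L4Allow { host: host.to_string(), port }
    };
    Stage1Chunk {
        rule_name: format!("auto_{}_{}", sanitize(host), port),
        rule,
        rationale: format!("Denied {count} connections to {host}:{port} from {binary}"),
    }
}
```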

**Stage 2** (L7 audit data → refined recommendation):
- When an approved Stage 1 chunk has accumulated `l7_request_samples`:
  - All GET/HEAD/OPTIONS → `access: read-only`
  - Includes POST but no DELETE → `access: read-write`
  - Includes DELETE → `access: full`
- Set `enforcement: enforce`, `supersedes_chunk_id` pointing to the Stage 1 chunk
- Rationale: `"Observed {n} HTTP requests ({method_summary}). Recommending {access} access."`
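The access-level decision in Stage 2 can be sketched as a small function over the observed methods. The names are hypothetical. The rules above do not say how to classify methods such as PUT or PATCH, so this sketch returns `None` for those cases, and for empty samples, rather than guessing.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access {
    ReadOnly,
    ReadWrite,
    Full,
}

impl Access {
    /// Label as it appears in policy YAML.
    pub fn as_str(self) -> &'static str {
        match self {
            Access::ReadOnly => "read-only",
            Access::ReadWrite => "read-write",
            Access::Full => "full",
        }
    }
}

/// Derives an access level from the HTTP methods seen during the audit phase.
pub fn infer_access(methods: &[&str]) -> Option<Access> {
    if methods.is_empty() {
        return None;
    }
    let has = |m: &str| methods.iter().any(|x| x.eq_ignore_ascii_case(m));
    if has("DELETE") {
        return Some(Access::Full);
    }
    if has("POST") {
        return Some(Access::ReadWrite);
    }
    let read_only = methods.iter().all(|x| {
        ["GET", "HEAD", "OPTIONS"].iter().any(|s| x.eq_ignore_ascii_case(s))
    });
    if read_only {
        Some(Access::ReadOnly)
    } else {
        // Methods outside the listed rules (e.g. PUT, PATCH): left undecided here.
        None
    }
}
```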

This is intentionally simple — the LLM PolicyAdvisor (follow-up issue) replaces it with intelligent grouping, security analysis, and richer rationale.

### Sandbox gRPC client

Extend `CachedNavigatorClient` (`crates/navigator-sandbox/src/grpc_client.rs`) with:
- `submit_policy_analysis()` method: sends `SubmitPolicyAnalysisRequest` with denial summaries + proposed chunks
- Error handling: log and retry on transient failures; log and drop the submission on permanent failures
- Include `analysis_mode: "mechanistic"` field
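The retry behavior could be structured along these lines. This is a synchronous sketch with a hypothetical `SubmitError` classification and exponential backoff. The real client is presumably async, and the attempt limit and delays are assumptions, not values from this issue.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical classification of submission failures.
#[derive(Debug, PartialEq, Eq)]
pub enum SubmitError {
    /// Worth retrying, e.g. the gateway is temporarily unreachable.
    Transient(String),
    /// Retrying will not help, e.g. the gateway rejected the payload.
    Permanent(String),
}

/// Calls `attempt` until it succeeds, fails permanently, or `max_attempts` is reached.
pub fn submit_with_retry<F>(
    mut attempt: F,
    max_attempts: u32,
    base_delay: Duration,
) -> Result<(), SubmitError>
where
    F: FnMut() -> Result<(), SubmitError>,
{
    let mut delay = base_delay;
    let mut tries = 0;
    loop {
        tries += 1;
        match attempt() {
            Ok(()) => return Ok(()),
            Err(SubmitError::Permanent(msg)) => {
                eprintln!("submit_policy_analysis failed permanently, dropping: {msg}");
                return Err(SubmitError::Permanent(msg));
            }
            Err(SubmitError::Transient(msg)) => {
                eprintln!("submit_policy_analysis attempt {tries} failed: {msg}");
                if tries >= max_attempts {
                    return Err(SubmitError::Transient(msg));
                }
                sleep(delay);
                // Double the delay before the next attempt.
                delay = delay.saturating_mul(2);
            }
        }
    }
}
```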

### Approval merge flow

When a chunk is approved:
1. Load current active `SandboxPolicy` (latest loaded version)
2. Insert chunk's `proposed_rule` into `network_policies` map under `chunk.rule_name`
3. Validate merged policy (static fields unchanged, no duplicate endpoint coverage)
4. Persist via existing `UpdateSandboxPolicy` internals
5. Update chunk status → approved, denial_summaries status → resolved
6. Sandbox picks up new policy on next 30s poll cycle, reloads OPA
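Steps 2 and 3 can be sketched as a pure function over the `network_policies` map. `Rule` and `MergeError` are hypothetical simplifications. "Duplicate endpoint coverage" is interpreted here as another rule with the same host, port, and binary, following the conflict-detection bullet in the validation section.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a network policy rule.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Rule {
    pub host: String,
    pub port: u16,
    pub binary: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum MergeError {
    /// A differently named rule already covers the same host:port:binary.
    DuplicateCoverage { existing_rule: String },
}

/// Returns a new policy map with `proposed` inserted under `rule_name`.
/// The input map is left untouched, so a failed merge changes nothing.
pub fn merge_chunk(
    policies: &BTreeMap<String, Rule>,
    rule_name: &str,
    proposed: Rule,
) -> Result<BTreeMap<String, Rule>, MergeError> {
    for (name, existing) in policies {
        let same_endpoint = existing.host == proposed.host
            && existing.port == proposed.port
            && existing.binary == proposed.binary;
        // A rule with the same name is replaced rather than treated as a conflict.
        if name.as_str() != rule_name && same_endpoint {
            return Err(MergeError::DuplicateCoverage { existing_rule: name.clone() });
        }
    }
    let mut merged = policies.clone();
    merged.insert(rule_name.to_string(), proposed);
    Ok(merged)
}
```

Returning a fresh map keeps validation and persistence separate: the caller only persists the merged policy if the merge succeeded.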

For chunk supersession (Stage 2 replacing Stage 1):
1. Approve the Stage 2 chunk (same merge flow)
2. The Stage 2 rule replaces the Stage 1 rule (same `rule_name`)
3. Mark the original Stage 1 chunk as `superseded`

### DraftPolicyUpdate streaming event

New `SandboxStreamEvent` variant sent via `WatchSandbox` stream:
- `draft_version`, `new_chunks` count, `total_pending` count, brief `summary`
- CLI `logs --tail` renders: `"Draft policy updated: N new chunks. Run 'openshell sandbox draft <name>' to review."`
- TUI shows notification badge on Draft tab
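For illustration, the CLI rendering of this event might look like the sketch below. The struct mirrors the fields listed above, but the Rust field types are assumptions; the real type would be generated from the proto definition.

```rust
/// Hand-written stand-in for the generated `DraftPolicyUpdate` message.
pub struct DraftPolicyUpdate {
    pub draft_version: u64,
    pub new_chunks: u32,
    pub total_pending: u32,
    pub summary: String,
}

/// Formats the line shown by `logs --tail` when a draft update arrives.
pub fn render_log_line(update: &DraftPolicyUpdate, sandbox_name: &str) -> String {
    format!(
        "Draft policy updated: {} new chunks. Run 'openshell sandbox draft {}' to review.",
        update.new_chunks, sandbox_name
    )
}
```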

### CLI commands

New subcommands under `openshell sandbox draft`:

| Command | Description |
|---------|-------------|
| `sandbox draft <name>` | View the full living draft policy |
| `sandbox draft <name> --chunks` | View individual chunks with rationale |
| `sandbox draft <name> --chunk <id>` | View a specific chunk in detail |
| `sandbox draft approve <name> <chunk_id>` | Approve a specific chunk |
| `sandbox draft approve <name> --all` | Approve all pending (skips security-flagged) |
| `sandbox draft approve <name> --all --force` | Include security-flagged chunks |
| `sandbox draft reject <name> <chunk_id>` | Reject a specific chunk |
| `sandbox draft reject <name> <chunk_id> --reason "..."` | Reject with reason |
| `sandbox draft apply <name>` | Approve all + merge into active policy |
| `sandbox draft clear <name>` | Clear all pending chunks |
| `sandbox draft edit <name> <chunk_id> [--allowed-ips ...] [--access ...]` | Modify a pending chunk |
| `sandbox draft undo <name> <chunk_id>` | Reverse last approval |
| `sandbox draft history <name>` | Show decision history |

See design doc Section 8a for detailed CLI flow examples.

### TUI draft panel

New view mode in sandbox detail screen (keybinding cycle: `p` → `l` → `d` → `p`):
- List pending/approved/rejected chunks with rationale summaries
- Keybindings: `a` approve, `r` reject, `A` approve-all, `Enter` detail popup
- Chunk detail popup: full proposed YAML, rationale, denial event summary
- Live updates via `DraftPolicyUpdate` stream event
- Status bar notification when new chunks arrive

See design doc Section 8b for TUI mockups.

## Codebase references

| Area | File | Lines |
|------|------|-------|
| Proxy deny path | `crates/navigator-sandbox/src/proxy.rs` | 237-299 |
| L7 relay events | `crates/navigator-sandbox/src/l7/relay.rs` | 123-133 |
| Log push | `crates/navigator-sandbox/src/log_push.rs` | 44, 93 |
| Policy poll loop | `crates/navigator-sandbox/src/lib.rs` | ~543-647 |
| gRPC client | `crates/navigator-sandbox/src/grpc_client.rs` | — |
| OPA evaluate | `crates/navigator-sandbox/src/opa.rs` | 227-237 |
| Gateway gRPC | `crates/navigator-server/src/grpc.rs` | 1223+ |
| Persistence Store | `crates/navigator-server/src/persistence/mod.rs` | — |
| Tracing bus | `crates/navigator-server/src/tracing_bus.rs` | — |
| SSRF check | `crates/navigator-sandbox/src/proxy.rs` | 956-974 |
| Proto: sandbox events | `proto/navigator.proto` | SandboxStreamEvent |
| Proto: policy rules | `proto/sandbox.proto` | NetworkPolicyRule |
| CLI commands | `crates/navigator-cli/src/main.rs` | SandboxCommands |
| CLI handlers | `crates/navigator-cli/src/run.rs` | — |
| TUI app | `crates/navigator-tui/src/app.rs` | Focus enum |
| Policy schema ref | `architecture/security-policy.md` | — |

## Design document

Full design: https://gitlab-master.nvidia.com/-/snippets/12930
Local copy: `architecture/plans/issue-153-policy-recommendations/00-deep-analysis.md`

## Effort estimate

~15-18 days (Phases 1, 2, 4, 5 from the design doc, minus LLM-specific parts)


