Agent retries denied actions instead of abandoning the rule class

> **Follow-up from [PR #88](https://github.com/aws-samples/sample-autonomous-cloud-coding-agents/pull/88)** — observed during the 2026-05-12 E2E Phase 4 (deny path) test run; documented in the test plan scorecard. Filed as **HITL UX bug** since it directly impacts the usability of the security model PR #88 shipped: if denying creates approval fatigue, operators become reluctant to deny, and the soft-deny gate erodes from "an explicit security check" to "a friction point operators learn to bypass." That undermines the entire three-outcome design.

## Functional description

When an operator denies a Cedar HITL approval request with `bgagent deny <task-id> <request-id> --reason "..."`, the agent receives the deny decision and continues running. **The agent's current default behavior is to interpret the deny as "this specific tool call was rejected — try a different approach"** rather than "this class of action is forbidden — abandon this strategy."

**Concrete observation (E2E run, task `01KRET3HRC6YQ3TB7YPVPA9NZR`):**

- Agent attempts `git push --force origin some-branch`.
- `force_push_any` rule triggers; approval requested; operator denies with reason "Force push not appropriate for this task."
- Agent receives the deny. Instead of giving up on force-push, **agent retries with a slightly different command** (different branch name, different flags). New approval request fires.
- Operator denies again. Agent retries a third time (gets timed out before the operator can deny again).
- Net result: 2 DENIED + 1 TIMED_OUT rows on the same `force_push_any` rule, all within one task. The agent eventually hits `max_turns` and the task fails.

**User-visible impact:**

- Operator approval fatigue — being asked the "same question" 3 times for what feels like the same intent.
- Tasks burn budget on retries that the operator has already implicitly forbidden.
- The audit trail shows multiple denied rows that conceptually represent a single rejected intent.

## Technical context

**Why the agent retries:** the deny reason is surfaced to the agent as part of its conversation context, but the prompt today doesn't strongly direct the agent to **abandon the entire class of action**. The agent's reasoning is something like "the operator denied that specific command — maybe they'd allow a different version of it." That's not unreasonable behavior in general, but for security-flavored denies it's the wrong heuristic.

**Where the prompt lives:**
- `agent/prompts/` — the system prompt template variants.
- The deny-injection happens via the Cedar engine's REQUIRE_APPROVAL → DENY path in `agent/src/policy.py` and `agent/src/hooks.py`; the deny reason is threaded into the `_deny_response` message that becomes the next user-message.

**Two viable fixes:**

1. **Prompt-level fix (cheaper):** strengthen the deny-reason text injected into the agent's context. Instead of "The operator denied this with reason: <text>", use language like:
   > "**This class of action has been REJECTED by the operator.** Reason: <text>. **DO NOT retry this action with variations.** Find an alternative task strategy that does not require this kind of operation, or fail gracefully if no alternative exists."
2. **Engine-level fix (more invasive):** when the engine sees a DENIED outcome on a (rule_id, tool_name) pair, cache it for the rest of the session. Subsequent attempts at the same rule + tool combination auto-deny without prompting the operator.

These aren't mutually exclusive. The prompt-level fix should land first as it's strictly additive; the engine-level cache is an optimization on top.
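As a rough illustration of the prompt-level fix, the strengthened text could be produced by a small formatter like the one below. This is a minimal sketch: `format_deny_message` is a hypothetical name, not the existing `_deny_response` code path, and the handling of empty reasons and trailing periods is an assumption added for the example.

```python
def format_deny_message(reason: str) -> str:
    """Build the strengthened deny text injected into the agent's context.

    Hypothetical helper for illustration; the real injection point is in
    agent/src/hooks.py and may be shaped differently.
    """
    # Drop surrounding whitespace and a trailing period so the template's own
    # period does not double up; fall back to a placeholder if nothing is left.
    cleaned = reason.strip().rstrip(".") or "(no reason given)"
    return (
        "**This class of action has been REJECTED by the operator.** "
        f"Reason: {cleaned}. "
        "**DO NOT retry this action with variations.** "
        "Find an alternative task strategy that does not require this kind of "
        "operation, or fail gracefully if no alternative exists."
    )


if __name__ == "__main__":
    print(format_deny_message("Force push not appropriate for this task."))
```

Keeping the wording in one function means a regression test can pin the exact phrases the agent is expected to see.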

**Edge case to consider:** what if the operator wants the agent to retry? Today they can use `--scope rule:<rule_id>` on the approve to whitelist that rule for the rest of the session. So "retry-friendly" already has an explicit signal; "deny" should be the unambiguous "stop trying" signal.

## Proposed approach

**Phase 1 (prompt-level fix):**
1. Update the deny-reason injection in `agent/src/hooks.py` to use the strengthened language.
2. Add a regression test that verifies the new text shape.
3. Validate via E2E: re-run T4 with the new prompt, confirm the agent doesn't retry the same `force_push_any` rule within the task.
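The E2E check in step 3 could be automated by scanning the task's approval rows in order and flagging any rule that receives a new request after it was already denied. The sketch below assumes each row is a dict with `rule_id` and `status` keys; that row shape and the function name `retried_after_deny` are hypothetical, and only the `DENIED` / `TIMED_OUT` status strings come from the observed run.

```python
def retried_after_deny(rows: list[dict[str, str]]) -> set[str]:
    """Return rule_ids that triggered another approval request after a DENIED row.

    `rows` must be in chronological order for a single task.
    """
    denied: set[str] = set()
    retried: set[str] = set()
    for row in rows:
        rule = row["rule_id"]
        # Any request on a rule that was denied earlier counts as a retry,
        # regardless of how this later request was resolved.
        if rule in denied:
            retried.add(rule)
        if row["status"] == "DENIED":
            denied.add(rule)
    return retried


if __name__ == "__main__":
    # The sequence observed in the originating run.
    observed = [
        {"rule_id": "force_push_any", "status": "DENIED"},
        {"rule_id": "force_push_any", "status": "DENIED"},
        {"rule_id": "force_push_any", "status": "TIMED_OUT"},
    ]
    print(retried_after_deny(observed))
```

A passing Phase 1 run would return an empty set for the task.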

**Phase 2 (engine-level cache, conditional on phase 1 not being enough):**
- Add a `(rule_id, tool_name) → DENIED` set to the `PolicyEngine` for the session.
- On approval-flow entry, check the set first; if present, return DENY without prompting.
- Approving with per-rule scope (`--scope rule:X`) clears every key for that rule from the set.
- This is the IMPL-23 recent-decision cache extended to per-rule semantics; the cache layer already exists.
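The Phase 2 bullets can be sketched as a small per-session structure. This is an illustration only: `SessionDenyCache`, its method names, and the `"shell"` tool name are hypothetical and do not reflect the actual `PolicyEngine` or the IMPL-23 cache API.

```python
class SessionDenyCache:
    """Remembers (rule_id, tool_name) pairs the operator denied in this session."""

    def __init__(self) -> None:
        self._denied: set[tuple[str, str]] = set()

    def record_deny(self, rule_id: str, tool_name: str) -> None:
        """Called when the operator denies an approval request."""
        self._denied.add((rule_id, tool_name))

    def is_denied(self, rule_id: str, tool_name: str) -> bool:
        """Checked on approval-flow entry; True means deny without prompting."""
        return (rule_id, tool_name) in self._denied

    def clear_rule(self, rule_id: str) -> None:
        """Called when the operator approves with per-rule scope for rule_id."""
        self._denied = {key for key in self._denied if key[0] != rule_id}


if __name__ == "__main__":
    cache = SessionDenyCache()
    cache.record_deny("force_push_any", "shell")
    print(cache.is_denied("force_push_any", "shell"))  # True: fast-deny
    cache.clear_rule("force_push_any")
    print(cache.is_denied("force_push_any", "shell"))  # False: prompt again
```

Keying on the pair rather than on the exact command is what stops variations (different branch, different flags) from reaching the operator again.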

## Acceptance criteria

- [ ] Phase 1: deny-reason injection uses strengthened "abandon this class of action" language
- [ ] Phase 1: regression test on the new injection
- [ ] Phase 1: E2E re-run of T4 shows the agent does NOT retry the same rule within one task
- [ ] (If phase 1 isn't enough) Phase 2: engine caches DENIED (rule_id, tool_name) per session and fast-denies subsequent attempts
- [ ] Operator-facing behavior is documented: "deny means don't retry; if you want retry-friendly, use approve with `--scope rule:<rule_id>`"

## Out of scope

- Multi-task deny memory (denies in task A don't affect task B).
- Rule-level configuration of "deny means stop" vs "deny means try variation" (over-engineering today).
- Bulk deny ("deny everything matching this rule going forward at the operator level") — separate UX.

## References

- `agent/src/hooks.py` (deny injection point)
- `agent/src/policy.py` (REQUIRE_APPROVAL path; IMPL-23 cache layer is here)
- `.e2e-test-plan.md` Phase 4 scorecard for `01KRET3HRC6YQ3TB7YPVPA9NZR` (the originating observation)
- `docs/design/CEDAR_HITL_GATES.md` §6.5 (approval flow)

