Merged
11 changes: 11 additions & 0 deletions docs/api-reference.md
@@ -316,6 +316,17 @@ curl http://localhost:9100/v1/auth/keys
curl -X DELETE http://localhost:9100/v1/auth/keys/key-abc123
```

### Rotate API Key

```bash
curl -X POST http://localhost:9100/v1/auth/keys/key-abc123/rotate \
  -H "Authorization: Bearer $AEGIS_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"ttlDays": 365}'
```

Rotates an API key. Admin-only. Optionally set the TTL in days with `ttlDays` in the request body. Returns the updated key metadata.
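
When rotation is scripted, it can help to validate the TTL before it is interpolated into the JSON body. The sketch below is one way to do that. `rotate_body`, `rotate_key`, and the `AEGIS_URL` variable are hypothetical names used for illustration and are not part of Aegis; the endpoint, headers, and `ttlDays` field are taken from the example above.

```bash
# Hypothetical helper: print the JSON body for the rotate call.
# Rejects anything that is not a positive integer without leading zeros.
rotate_body() {
  case "$1" in
    ''|*[!0-9]*|0*)
      echo "ttlDays must be a positive integer, got: '$1'" >&2
      return 1
      ;;
  esac
  printf '{"ttlDays": %s}\n' "$1"
}

# Hypothetical wrapper: $1 = key ID, $2 = TTL in days.
# The parenthesised body runs in a subshell, so `exit` only leaves the function.
rotate_key() (
  body="$(rotate_body "$2")" || exit 1
  # AEGIS_URL is an assumed convenience variable; it defaults to the
  # host and port used throughout these docs.
  curl -fsS -X POST "${AEGIS_URL:-http://localhost:9100}/v1/auth/keys/$1/rotate" \
    -H "Authorization: Bearer $AEGIS_AUTH_TOKEN" \
    -H "Content-Type: application/json" \
    -d "$body"
)
```

With these definitions, `rotate_key key-abc123 365` issues the same request as the curl call above, while `rotate_key key-abc123 soon` fails before any request is sent.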

### Create SSE Token

```bash
46 changes: 45 additions & 1 deletion docs/enterprise.md
@@ -437,8 +437,52 @@ Returns aggregate health of all active sessions including stalled and idle detec
---
### Production Alerting

Today, Aegis has no built-in alerting system (issue #1418). Until that ships, you can build alerting on top of the existing observability endpoints.
Aegis includes an **AlertManager** that tracks failure events and fires webhook
notifications when configurable thresholds are exceeded.

#### Alert Endpoints

```bash
# Test webhook configuration (admin/operator only)
curl -X POST http://localhost:9100/v1/alerts/test \
  -H "Authorization: Bearer $AEGIS_AUTH_TOKEN"

# Get alert statistics (admin/operator/viewer)
curl -X GET http://localhost:9100/v1/alerts/stats \
  -H "Authorization: Bearer $AEGIS_AUTH_TOKEN"
```

**AlertManager monitors:**
- Session failures (crashes, unexpected exits)
- Dead sessions (tmux process gone)
- Tmux crashes
- API error rate threshold breaches

**Authorization Requirements:**
- `POST /v1/alerts/test` — requires `admin` or `operator` role
- `GET /v1/alerts/stats` — requires `admin`, `operator`, or `viewer` role

**Configuration:**

Set webhook URLs, the failure threshold, and the cooldown via environment variables:
```bash
export AEGIS_ALERT_WEBHOOKS="https://example.com/alerts,https://backup.com/alerts"
export AEGIS_ALERT_FAILURE_THRESHOLD=5
export AEGIS_ALERT_COOLDOWN_MS=600000
```

Or via `config.yaml`:
```yaml
alerting:
  webhooks:
    - https://example.com/alerts
  failureThreshold: 5
  cooldownMs: 600000
```
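
A typo in these settings may only surface when an alert fails to fire, so a small pre-flight check on the environment-variable form can be useful. The following is a minimal sketch, not part of Aegis: `is_positive_int` and `validate_alert_env` are hypothetical helpers, and the rules they enforce (http/https URLs, positive integers) are assumptions about what a sane configuration looks like. The numeric settings are only checked when set, since their defaults are not stated here.

```bash
# Hypothetical pre-flight check; not part of Aegis itself.

# Succeeds only for a positive integer without leading zeros.
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*|0*) return 1 ;;
  esac
  return 0
}

# The parenthesised body runs in a subshell, so the IFS and globbing
# changes below do not leak into the calling shell.
validate_alert_env() (
  status=0
  threshold="${AEGIS_ALERT_FAILURE_THRESHOLD:-}"
  cooldown="${AEGIS_ALERT_COOLDOWN_MS:-}"

  if [ -z "${AEGIS_ALERT_WEBHOOKS:-}" ]; then
    echo "AEGIS_ALERT_WEBHOOKS is empty: no webhook would be notified" >&2
    status=1
  fi

  # Split the comma-separated list; set -f prevents glob expansion.
  IFS=','
  set -f
  for url in ${AEGIS_ALERT_WEBHOOKS:-}; do
    case "$url" in
      http://?*|https://?*) ;;
      *) echo "not an http(s) URL: '$url'" >&2; status=1 ;;
    esac
  done

  # Assumed rule: when set, both numeric settings must be positive integers.
  if [ -n "$threshold" ] && ! is_positive_int "$threshold"; then
    echo "AEGIS_ALERT_FAILURE_THRESHOLD is not a positive integer: '$threshold'" >&2
    status=1
  fi
  if [ -n "$cooldown" ] && ! is_positive_int "$cooldown"; then
    echo "AEGIS_ALERT_COOLDOWN_MS is not a positive integer: '$cooldown'" >&2
    status=1
  fi

  [ "$status" -eq 0 ]
)
```

Calling `validate_alert_env` after the `export` lines returns a non-zero status and prints one message per problem, for example when a URL is missing its scheme or a space follows a comma in the webhook list.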

Webhook payloads include severity, event type, session ID, and timestamp.
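
To exercise a webhook receiver before pointing Aegis at it, a hand-built payload with those four pieces of information can be posted directly. The sketch below is illustrative only: the JSON field names (`severity`, `eventType`, `sessionId`, `timestamp`), the sample values, the ISO 8601 timestamp format, and both function names are assumptions, so compare against a payload your receiver actually gets from Aegis before relying on them.

```bash
# Hypothetical payload shape: the field names and timestamp format are
# assumptions, not confirmed by the Aegis docs.
# $1 = severity, $2 = event type, $3 = session ID
# Arguments are inserted as-is, so they must not contain quotes or backslashes.
sample_alert_payload() {
  printf '{"severity": "%s", "eventType": "%s", "sessionId": "%s", "timestamp": "%s"}\n' \
    "$1" "$2" "$3" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}

# POST one sample payload to a receiver you control ($1 = receiver URL).
send_sample_alert() {
  sample_alert_payload "critical" "session_failure" "sess-123" |
    curl -fsS -X POST "$1" \
      -H "Content-Type: application/json" \
      --data-binary @-
}
```

For instance, `send_sample_alert https://example.com/alerts` sends a single sample event to the first webhook URL used in the configuration above.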

**Recommended external alerting setup:**
**Recommended alerting setup:**

```bash