From 190e609ecdb5a8513ecd2d930152da8703e37649 Mon Sep 17 00:00:00 2001 From: Cristian Magherusan-Stanciu Date: Sat, 25 Apr 2026 16:10:03 +0200 Subject: [PATCH] docs(specs): add multi-account execution draft spec MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit specs/multi-account-execution/ — design doc set covering CUDly's move from single-account-per-provider to N-account-per-provider across AWS / Azure / GCP. Status: Draft (2026-04-02). Nine documents: README.md — problem statement + scope + index acceptance.md — Gherkin-style acceptance criteria api.md — REST surface changes (account CRUD, overrides) backend.md — handler + scheduler + collection-pipeline impacts data-model.md — new DB tables + column additions + migrations frontend.md — UI flows (account list, per-account dashboards) iac.md — Terraform module changes for the multi-account federation bundle security.md — credential storage, audit log, scoped tokens tasks.md — phased implementation plan (one atomic commit per task, ordered by dependency) No implementation has shipped yet — the live scheduler still collects per-provider, not per-account-per-provider. The recent work in 32f9e4ffc (history pending), b44283746 (failed/expired states), and the Azure scheduler dedupe (`b3fe6e225`) was scoped to keep the existing single-account model working; this spec is the substrate for the next phase. Doc-only commit. No code or test changes. 
--- specs/multi-account-execution/README.md | 61 +++ specs/multi-account-execution/acceptance.md | 359 ++++++++++++ specs/multi-account-execution/api.md | 403 ++++++++++++++ specs/multi-account-execution/backend.md | 365 +++++++++++++ specs/multi-account-execution/data-model.md | 329 +++++++++++ specs/multi-account-execution/frontend.md | 571 ++++++++++++++++++++ specs/multi-account-execution/iac.md | 322 +++++++++++ specs/multi-account-execution/security.md | 188 +++++++ specs/multi-account-execution/tasks.md | 432 +++++++++++++++ 9 files changed, 3030 insertions(+) create mode 100644 specs/multi-account-execution/README.md create mode 100644 specs/multi-account-execution/acceptance.md create mode 100644 specs/multi-account-execution/api.md create mode 100644 specs/multi-account-execution/backend.md create mode 100644 specs/multi-account-execution/data-model.md create mode 100644 specs/multi-account-execution/frontend.md create mode 100644 specs/multi-account-execution/iac.md create mode 100644 specs/multi-account-execution/security.md create mode 100644 specs/multi-account-execution/tasks.md diff --git a/specs/multi-account-execution/README.md b/specs/multi-account-execution/README.md new file mode 100644 index 00000000..4b5a57b1 --- /dev/null +++ b/specs/multi-account-execution/README.md @@ -0,0 +1,61 @@ + +# Spec: Multi-Account Execution + +**Status**: Draft +**Created**: 2026-04-02 +**Authors**: Engineering + +## Problem Statement + +CUDly currently manages a single AWS account (via IAM role), a single Azure subscription, and a single GCP project. Organizations using CUDly at scale run workloads across dozens or hundreds of cloud accounts. Cost-optimization tooling that cannot see all accounts simultaneously cannot produce accurate recommendations, cannot purchase commitments across an org, and forces users to log in separately to each account. + +This feature enables CUDly to: + +1. Manage any number of cloud accounts/subscriptions per provider +2. 
Collect recommendations and purchase commitments across all accounts in parallel +3. Display aggregated or per-account data in every view +4. Apply service-level settings globally or override them per account + +## Scope + +All three supported cloud providers are in scope: + +- **AWS**: EC2, RDS, ElastiCache, OpenSearch, Redshift, Savings Plans — across any number of AWS accounts +- **Azure**: VM, SQL, Cosmos DB reservations — across any number of Azure subscriptions +- **GCP**: Compute, Cloud SQL committed use discounts — across any number of GCP projects + +## Document Index + +| Document | Contents | +|----------|----------| +| [data-model.md](./data-model.md) | New DB tables, column additions to existing tables, migration strategy | +| [api.md](./api.md) | New REST endpoints + modifications to existing endpoints | +| [backend.md](./backend.md) | Credential resolution, org discovery, parallel execution engine, service config cascade | +| [frontend.md](./frontend.md) | Settings UI (account CRUD), filter changes, plan account association, state model | +| [security.md](./security.md) | Credential encryption, access control, audit logging, input validation | +| [iac.md](./iac.md) | Terraform changes: Secrets Manager key, Lambda IAM policies, env vars for all environments | +| [acceptance.md](./acceptance.md) | BDD-style acceptance scenarios (the definition of done) | +| [tasks.md](./tasks.md) | Ordered implementation tasks with file paths, dependencies, and test requirements | + +## Key Design Decisions + +| Decision | Choice | Rationale | +|----------|--------|-----------| +| AWS auth modes | `access_keys`, `role_arn`, `bastion` | Covers all real-world patterns: simple dev accounts, cross-account role delegation, AWS Organizations with a central hub | +| Bastion pattern | One hub account (keys or role) assumes roles in N target accounts | Standard AWS Organizations security model; avoids giving CUDly direct keys to every account | +| AWS Org discovery | Add org 
root/management account → CUDly calls Organizations API to list member accounts | No separate discovery step; the management account IS the discovery mechanism | +| Service override cascade | Sparse: `NULL` fields in account override inherit the global default | Minimises configuration duplication; operators set global sensible defaults and only override what differs per account | +| Plan ↔ Account relationship | Many-to-many via `plan_accounts` join table | One plan can fan out across multiple accounts; one account can participate in multiple plans | +| Parallel execution | Backend goroutine fan-out per account; per-account errors are isolated | Failure in one account does not block others; results collected and tagged with `cloud_account_id` | +| Credential storage | AES-256-GCM encrypted blob in `account_credentials` table; key loaded from AWS Secrets Manager (ARN via `CREDENTIAL_ENCRYPTION_KEY_SECRET_ARN` env) in prod; direct hex env var for local dev | Consistent with existing project secret management pattern; encryption key never in plaintext Lambda env | + +## Glossary + +| Term | Definition | +|------|-----------| +| **Cloud Account** | A single AWS account, Azure subscription, or GCP project managed by CUDly | +| **Bastion Account** | An AWS account whose credentials CUDly holds directly and uses to assume roles in other (target) accounts | +| **Org Root Account** | An AWS management account that has AWS Organizations access; used to discover member accounts | +| **Service Override** | A per-account, per-service config value that overrides the global default | +| **Plan Fan-out** | The process of executing a single purchase plan concurrently across all accounts it targets | +| **Effective Config** | The merged config for (account, provider, service): account overrides applied on top of global defaults | diff --git a/specs/multi-account-execution/acceptance.md b/specs/multi-account-execution/acceptance.md new file mode 100644 index 00000000..2f2c889b --- 
/dev/null +++ b/specs/multi-account-execution/acceptance.md @@ -0,0 +1,359 @@ + +# Acceptance Criteria + +BDD-style scenarios using Given / When / Then. +Each scenario maps to one or more automated tests or manual verification steps. + +--- + +## A. Account CRUD + +### A-1: Create an AWS account with Role ARN + +**Given** no cloud accounts are configured +**When** I POST `/api/accounts` with: + +```json +{ + "name": "Prod-US", + "provider": "aws", + "external_id": "123456789012", + "aws_auth_mode": "role_arn", + "aws_role_arn": "arn:aws:iam::123456789012:role/CUDly" +} +``` + +**Then** the response is `201` with the created account object +**And** `GET /api/accounts` returns the account in the list +**And** `credentials_configured` is `false` (no credentials provided yet) + +--- + +### A-2: Create account with inline credentials + +**Given** a valid account payload +**When** I include `credentials: { "access_key_id": "...", "secret_access_key": "..." }` in the POST (inline credentials on create; `type` is inferred from `aws_auth_mode = "access_keys"`) +**Then** the account is created AND credentials are stored +**And** `credentials_configured` is `true` +**And** the credentials are NOT returned in any subsequent GET response + +--- + +### A-2a: provider and external_id are immutable + +**Given** account X exists with `provider=aws, external_id=123456789012` +**When** I PUT `/api/accounts/X.id` with `{ "provider": "azure" }` in the body +**Then** response is `400` with an error describing that `provider` is immutable +**And** account X is unchanged + +--- + +### A-3: Duplicate external_id is rejected + +**Given** account with `provider=aws, external_id=123456789012` exists +**When** I POST another account with the same `(provider, external_id)` +**Then** the response is `409 Conflict` + +--- + +### A-4: Bastion chain not allowed + +**Given** account A has `aws_auth_mode = 'bastion'` +**When** I create account B with `aws_auth_mode = 'bastion'` and `aws_bastion_id = 
A.id` +**Then** the response is `400` with error `"bastion chaining is not supported"` + +--- + +### A-5: Delete account cascades correctly + +**Given** account X exists with service overrides and purchase history +**When** I DELETE `/api/accounts/X.id` +**Then** response is `204` +**And** `account_credentials` rows for X are deleted +**And** `account_service_overrides` rows for X are deleted +**And** `purchase_history` rows for X retain their data but have `cloud_account_id = NULL` + +--- + +### A-6: Delete bastion account blocked by dependents + +**Given** account B is used as bastion by accounts C and D +**When** I DELETE `/api/accounts/B.id` +**Then** the response is `409` with a message listing accounts C and D +**And** account B still exists + +--- + +## B. Credential Management + +### B-1: Credentials are never returned + +**Given** credentials are stored for account X +**When** I GET `/api/accounts/X.id` +**Then** the response body does NOT contain `access_key_id`, `secret_access_key`, `client_secret`, `private_key`, or any credential value +**And** `credentials_configured: true` is present + +--- + +### B-2: Test credentials — success + +**Given** account X has valid credentials stored +**When** I POST `/api/accounts/X.id/test` +**Then** response is `200` with `{ "ok": true, "caller_identity": "..." }` +**And** no credential values appear in the response or server logs + +--- + +### B-3: Test credentials — failure + +**Given** account X has invalid/expired credentials +**When** I POST `/api/accounts/X.id/test` +**Then** response is `200` (not `5xx`) with `{ "ok": false, "error": "..." 
}` +**And** the error message does NOT contain the credential value + +--- + +### B-3a: Test credentials — no credentials configured + +**Given** account X exists but has no credentials stored +**When** I POST `/api/accounts/X.id/test` +**Then** response is `200` (not `404`) with `{ "ok": false, "error": "no credentials configured" }` + +--- + +### B-4: Overwrite credentials + +**Given** credentials are already stored for account X +**When** I POST `/api/accounts/X.id/credentials` with new credential values +**Then** response is `204` +**And** old encrypted blob is replaced in `account_credentials` +**And** test connectivity returns success with the new credentials + +--- + +## C. Service Overrides + +### C-1: Global default is used when no override exists + +**Given** global service config: `{ provider: "aws", service: "ec2", term: 3, coverage: 80 }` +**And** account X has no override for `aws/ec2` +**When** recommendations are collected for account X +**Then** the effective config used for EC2 is `term=3, coverage=80` + +--- + +### C-2: Account override takes precedence for set fields + +**Given** global: `{ term: 3, coverage: 80 }` +**And** account X has override: `{ term: 1 }` (coverage not set) +**When** recommendations are collected for account X +**Then** the effective config is `term=1, coverage=80` (term overridden, coverage inherited) + +--- + +### C-3: Deleting an override reverts to global + +**Given** account X has override `{ term: 1 }` for `aws/ec2` +**When** I DELETE `/api/accounts/X.id/service-overrides/aws/ec2` +**Then** response is `204` +**And** effective config for account X / EC2 reverts to global default + +--- + +## D. 
Multi-Account Filtering + +### D-1: Filter recommendations by account + +**Given** accounts A (`123...`) and B (`234...`) have separate recommendations +**When** I GET `/api/recommendations?account_ids=A.id` +**Then** only recommendations from account A are returned +**And** recommendations from account B are absent + +--- + +### D-2: Filter dashboard by multiple accounts + +**Given** accounts A, B, C each have savings data +**When** I GET `/api/dashboard/summary?account_ids=A.id,B.id` +**Then** the summary aggregates A and B only +**And** C's data is excluded + +--- + +### D-3: No account filter returns all accounts + +**Given** accounts A, B, C exist +**When** I GET `/api/recommendations` (no `account_ids` param) +**Then** recommendations from all three accounts are returned + +--- + +### D-4: Disabled accounts are excluded from unfiltered results + +**Given** account D exists but `enabled = false` +**When** I GET `/api/recommendations` (no filter) +**Then** recommendations from account D are NOT included +(Disabled accounts are excluded from all data collection.) + +--- + +## E. 
Plan Fan-out + +### E-1: Plan targeting two accounts creates two execution records + +**Given** plan P targets accounts A and B +**When** the plan executes +**Then** two rows are inserted in `purchase_executions`: one with `cloud_account_id = A.id`, one with `cloud_account_id = B.id` +**And** both executions proceed in parallel (goroutines; verified via execution timestamps being close) + +--- + +### E-2: One account failure does not fail others + +**Given** plan P targets accounts A (valid credentials) and B (invalid credentials) +**When** the plan executes +**Then** account A's execution completes successfully +**And** account B's execution is marked `failed` with an error message +**And** account A's execution record is unaffected by B's failure + +--- + +### E-3: Plan with empty account list targets all provider accounts + +**Given** plan P has services with keys like `"aws:ec2"` (provider derived at runtime as `aws`) and no entries in `plan_accounts` +**And** three AWS accounts are enabled (A, B, C) +**When** the plan executes +**Then** three execution records are created (one per account) + +--- + +### E-4: Accounts from another provider are rejected when associating a plan + +**Given** plan P has services with keys like `"aws:ec2"` (provider derived as `aws`) +**And** accounts A (AWS), B (AWS), C (Azure) are enabled +**When** `PUT /api/plans/P.id/accounts` is called with `[A.id, B.id, C.id]` +**Then** the API returns `400` because C is not an AWS account +**And** no accounts are associated + +--- + +## F. 
Org Discovery + +### F-1: Discovery creates accounts for new members + +**Given** org root account R is configured with valid credentials +**And** Organizations has member accounts M1, M2, M3 +**And** M1 already exists in `cloud_accounts` +**When** I POST `/api/accounts/discover-org` with `org_root_account_id = R.id` +**Then** response is `200` with `{ "discovered": 3, "created": 2, "skipped": 1 }` +**And** M2 and M3 are created with `enabled = false` +**And** M1 is untouched (skipped) + +--- + +### F-2: Non-admin cannot trigger discovery + +**Given** user has role `user` (not `admin`) +**When** I POST `/api/accounts/discover-org` +**Then** response is `403 Forbidden` + +--- + +### F-3: Discovery from non-org-root is rejected + +**Given** account X has `aws_is_org_root = false` +**When** I POST `/api/accounts/discover-org` with `org_root_account_id = X.id` +**Then** response is `400` with a descriptive error + +--- + +## G. Frontend — Settings UI + +### G-1: Account appears in list after creation + +**Given** the Settings tab is open, AWS provider enabled +**When** I click "+ Add Account" and fill in valid details +**And** I click "Save Account" +**Then** the new account appears in the `#aws-account-list` without page reload +**And** the row shows the account name, Account ID, and auth mode badge + +--- + +### G-2: Credentials status updates after saving credentials + +**Given** account X shows "⚠ No credentials" +**When** I click "Credentials" and save valid credentials +**Then** the row updates to "✓ Credentials set" + +--- + +### G-3: Test button shows inline success/error + +**Given** account X has credentials configured +**When** I click "Test" +**Then** a non-blocking notification appears: "✓ Connected as arn:aws:iam::..." 
or "✗ Test failed: [reason]" +**And** the page does not navigate away + +--- + +### G-4: Overrides panel expands inline + +**Given** account X is shown in the account list +**When** I click "Overrides" +**Then** a panel expands below the account row showing service rows +**And** rows with active overrides show "(override)" badge +**And** rows without overrides show "(global)" badge + +--- + +## H. Frontend — Filtering + +### H-1: Account filter is populated on tab load + +**Given** three AWS accounts are configured +**When** I navigate to the Recommendations tab +**Then** the `#recommendations-account-filter` select contains "All Accounts" plus the three account names + +--- + +### H-2: Provider filter change updates account list + +**Given** the provider filter is set to "All" +**And** the account filter shows AWS + Azure accounts +**When** I change the provider filter to "AWS" +**Then** the account filter is repopulated with AWS accounts only + +--- + +### H-3: Multi-account selection is passed to API + +**Given** accounts A and B are selected in the account filter +**When** recommendations reload +**Then** the API request includes `account_ids=A.id,B.id` + +--- + +## I. 
Backward Compatibility + +### I-1: Existing purchase history is unaffected + +**Given** purchase history records exist with `cloud_account_id = NULL` (pre-migration) +**When** I GET `/api/history` with no filters +**Then** old records are included in the response +**And** their `cloud_account_id` field is absent or null + +--- + +### I-2: Existing config endpoints still work + +**Given** multi-account feature is deployed +**When** I GET `/api/config` +**Then** the response is identical to pre-deployment (no breaking changes to existing config shape) + +--- + +### I-3: Single-account workflow unchanged + +**Given** only one AWS account is configured +**When** using any existing feature (recommendations, history, plans) +**Then** the behavior is identical to pre-deployment with no account filter applied diff --git a/specs/multi-account-execution/api.md b/specs/multi-account-execution/api.md new file mode 100644 index 00000000..29793311 --- /dev/null +++ b/specs/multi-account-execution/api.md @@ -0,0 +1,403 @@ + +# API Design + +All new endpoints are under the existing `/api` prefix and follow the same auth middleware, error response format, and permission model as current endpoints. + +--- + +## Cloud Accounts Endpoints + +### `GET /api/accounts` + +List all configured cloud accounts. Returns metadata only — no credential material. 
+ +**Query params:** + +| Param | Type | Description | +|-------|------|-------------| +| `provider` | string | Filter by provider: `aws`, `azure`, `gcp` | +| `enabled` | bool | Filter by enabled status | +| `search` | string | Substring match on `name` or `external_id` | + +**Response `200`:** + +```json +{ + "accounts": [ + { + "id": "uuid", + "name": "Prod-US", + "description": "Production US account", + "contact_email": "infra@example.com", + "enabled": true, + "provider": "aws", + "external_id": "123456789012", + "aws_auth_mode": "role_arn", + "aws_role_arn": "arn:aws:iam::123456789012:role/CUDly", + "aws_is_org_root": false, + "credentials_configured": true, + "created_at": "2026-01-01T00:00:00Z", + "updated_at": "2026-01-01T00:00:00Z" + } + ] +} +``` + +--- + +### `POST /api/accounts` + +Create a new cloud account. Credentials can be included inline and are stored encrypted. + +**Request body:** + +```json +{ + "name": "Prod-US", + "description": "Production US account", + "contact_email": "infra@example.com", + "enabled": true, + "provider": "aws", + "external_id": "123456789012", + "aws_auth_mode": "role_arn", + "aws_role_arn": "arn:aws:iam::123456789012:role/CUDly", + "aws_external_id": "cudly-external-123", + + "credentials": { + "access_key_id": "AKIA...", // only for aws_auth_mode = access_keys + "secret_access_key": "..." // only for aws_auth_mode = access_keys + } +} +``` + +For Azure: + +```json +{ + "provider": "azure", + "name": "Azure Prod", + "external_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", + "azure_subscription_id": "xxxxxxxx-...", + "azure_tenant_id": "yyyyyyyy-...", + "azure_client_id": "zzzzzzzz-...", + "credentials": { + "client_secret": "..." + } +} +``` + +For GCP: + +```json +{ + "provider": "gcp", + "name": "GCP Analytics", + "external_id": "my-project-123", + "gcp_project_id": "my-project-123", + "credentials": { + "service_account_json": "{ \"type\": \"service_account\", ... 
}" + } +} +``` + +**Response `201`:** Created account object (same shape as GET, no credential material). + +**Validation errors `400`:** + +- `provider` not in `{aws, azure, gcp}` +- `aws_auth_mode = role_arn` but no `aws_role_arn` provided +- `aws_auth_mode = bastion` but no `aws_bastion_id` or bastion account not found +- Duplicate `(provider, external_id)` → `409 Conflict` + +--- + +### `GET /api/accounts/:id` + +Get a single account. Never returns credential material. + +**Response `200`:** Same shape as list item. +**Response `404`:** Account not found. + +--- + +### `PUT /api/accounts/:id` + +Update account metadata. Does not touch credentials. + +**Request body:** Same fields as POST, minus `credentials`, `provider`, and `external_id`. `provider` and `external_id` are immutable after creation; the server returns `400` if either is included in the PUT body. +**Response `200`:** Updated account. + +--- + +### `DELETE /api/accounts/:id` + +Delete account and cascade-delete its credentials and service overrides. Purchase history records retain the account name as a string but FK becomes `NULL`. + +**Response `204`:** No content. +**Response `409`:** Account is referenced as bastion by other accounts (caller must re-assign those first). + +On delete, CUDly applies `ON DELETE CASCADE` to `account_credentials` and `account_service_overrides`, and `ON DELETE SET NULL` to `cloud_account_id` in `purchase_history`, `purchase_executions`, `savings_snapshots`, and `ri_exchange_history`. The `account_id VARCHAR(20)` column (cloud provider's raw account ID) is retained unchanged in those tables. + +--- + +### `POST /api/accounts/:id/credentials` + +Replace credentials for an account. Write-only — returns no credential data. + +**Request body:** + +AWS access keys: + +```json +{ + "type": "aws_access_keys", + "access_key_id": "AKIA...", + "secret_access_key": "..." +} +``` + +Azure client secret: + +```json +{ + "type": "azure_client_secret", + "client_secret": "..." 
+} +``` + +GCP service account: + +```json +{ + "type": "gcp_service_account", + "service_account_json": "{ \"type\": \"service_account\", ... }" +} +``` + +**Response `204`:** No content. + +--- + +### `POST /api/accounts/:id/test` + +Test connectivity using the account's stored credentials. Makes a minimal read-only API call: + +- AWS: `sts:GetCallerIdentity` +- Azure: `GET /subscriptions/{id}` (read subscription metadata) +- GCP: `resourcemanager.projects.get` + +**Response `200`:** + +```json +{ + "ok": true, + "caller_identity": "arn:aws:iam::123456789012:role/CUDly" +} +``` + +**Response `200`** (failure — connection error is not an HTTP error): + +```json +{ + "ok": false, + "error": "NoCredentialProviders: no valid providers in chain" +} +``` + +**Response `200`** (no credentials stored — not a 404; credentials absence is a soft failure): + +```json +{ "ok": false, "error": "no credentials configured" } +``` + +**Response `404`:** Account not found. + +--- + +### `GET /api/accounts/:id/service-overrides` + +List all service overrides for an account. + +**Response `200`:** + +```json +{ + "overrides": [ + { + "id": "uuid", + "account_id": "uuid", + "provider": "aws", + "service": "ec2", + "term": 1, + "coverage": 60.0 + } + ] +} +``` + +Fields not overridden are omitted from the response (callers should merge with global defaults client-side or use the resolved-config endpoint). + +--- + +### `PUT /api/accounts/:id/service-overrides/:provider/:service` + +Create or replace a service override for an account. This is a **full-replace** operation: all override columns are set to the values provided in the request body; any field absent from the body is written as `NULL` (meaning "inherit from global default"). To update one field without resetting others, the caller must include the full desired override in the body. 
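The sparse semantics described above (a field absent from the override is stored as `NULL` and inherits the global default) can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `ServiceConfig`, `ServiceOverride`, and `Resolve` are hypothetical names, and only `term` and `coverage` are modeled.

```go
package main

import "fmt"

// ServiceConfig is a hypothetical, fully populated global default.
type ServiceConfig struct {
	Term     int
	Coverage float64
}

// ServiceOverride is a hypothetical sparse override.
// A nil field corresponds to a NULL column and means "inherit the global default".
type ServiceOverride struct {
	Term     *int
	Coverage *float64
}

// Resolve applies the non-nil override fields on top of the global default
// and returns the effective config. A nil override means no override row exists.
func Resolve(global ServiceConfig, o *ServiceOverride) ServiceConfig {
	eff := global
	if o == nil {
		return eff
	}
	if o.Term != nil {
		eff.Term = *o.Term
	}
	if o.Coverage != nil {
		eff.Coverage = *o.Coverage
	}
	return eff
}

func main() {
	one := 1
	global := ServiceConfig{Term: 3, Coverage: 80}

	// Acceptance scenario C-2: term overridden, coverage inherited.
	fmt.Println(Resolve(global, &ServiceOverride{Term: &one}))

	// Acceptance scenario C-1: no override row, global default applies.
	fmt.Println(Resolve(global, nil))
}
```

Because the PUT is a full replace, a body containing only `term` would correspond to an override with `Term` set and every other field nil, so previously overridden fields fall back to the global default.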
+ +**Request body (all fields optional):** + +```json +{ + "enabled": true, + "term": 1, + "payment": "all-upfront", + "coverage": 60.0, + "ramp_schedule": "immediate", + "include_regions": ["us-east-1", "eu-west-1"], + "exclude_engines": ["aurora-mysql"] +} +``` + +**Response `200`:** Saved override (sparse — only explicitly-set fields returned). + +--- + +### `DELETE /api/accounts/:id/service-overrides/:provider/:service` + +Delete override, reverting to global default. + +**Response `204`:** No content. + +--- + +### `POST /api/accounts/discover-org` + +Trigger AWS Organizations member account discovery using a configured org-root account. Idempotent: existing accounts matching discovered external IDs are skipped; new ones are created with `enabled = false` pending user review. + +**Request body:** + +```json +{ + "org_root_account_id": "uuid" +} +``` + +**Response `200`:** + +```json +{ + "discovered": 14, + "created": 3, + "skipped": 11, + "accounts": [ + { "name": "Member-Acct-A", "external_id": "111122223333", "created": true }, + ... + ] +} +``` + +**Response `400`:** `aws_is_org_root = false` for the referenced account (account exists but is not an org root). +**Response `404`:** `org_root_account_id` references an account that does not exist. +**Response `403`:** Caller is not an admin. + +--- + +## Plan ↔ Account Association + +### `GET /api/plans/:id/accounts` + +List accounts associated with a plan. + +**Required permission:** `view` on `accounts`. + +**Response `200`:** + +```json +{ + "account_ids": ["uuid1", "uuid2"], + "accounts": [ + { "id": "uuid1", "name": "Prod-US", "provider": "aws", "external_id": "123..." } + ] +} +``` + +--- + +### `PUT /api/plans/:id/accounts` + +Replace the full account list for a plan. Send an empty array to target all accounts for the plan's provider. + +**Required permission:** `update` on `accounts`. 
+ +**Request body:** + +```json +{ + "account_ids": ["uuid1", "uuid2"] +} +``` + +**Validation:** All referenced `account_ids` must exist and their `provider` must match the plan's provider (derived from the plan's `services` map keys — e.g. a key of `"aws:ec2"` means provider `aws`). +**Response `200`:** Same as GET response above. + +--- + +## Modifications to Existing Endpoints + +The following endpoints gain an optional `account_ids` query parameter. + +**`account_ids`**: comma-separated list of `cloud_accounts.id` UUIDs. Omitting the param (or passing `account_ids=`) returns data for **all** accounts. + +### Dashboard + +``` +GET /api/dashboard/summary?provider=aws&account_ids=uuid1,uuid2 +GET /api/dashboard/upcoming?account_ids=uuid1 +``` + +### Recommendations + +``` +GET /api/recommendations?provider=aws&account_ids=uuid1&service=ec2&region=us-east-1 +``` + +### History + +``` +GET /api/history?provider=aws&account_ids=uuid1,uuid2&start=2026-01-01&end=2026-03-31 +GET /api/history/analytics?account_ids=uuid1&interval=monthly +GET /api/history/breakdown?dimension=service&account_ids=uuid1 +``` + +--- + +## Error Response Format + +Unchanged from current pattern — a single `error` string key, no code field: + +```json +{ + "error": "account not found" +} +``` + +Standard HTTP status codes apply: `400` validation, `401` unauthenticated, `403` forbidden, `404` not found, `409` conflict, `500` internal. + +Error types are signalled via `NewClientError(code, message)` (defined in `internal/api/handler_router.go`). For example, `NewClientError(404, "account not found")` produces `{"error": "account not found"}` with HTTP 404. Any unhandled Go error maps to `500 {"error": "Internal server error"}`. 
+ +--- + +## Permissions + +New resource type `accounts` added to the permission model: + +| Action | Resource | Description | +|--------|----------|-------------| +| `view` | `accounts` | List and GET accounts | +| `create` | `accounts` | Create new accounts | +| `update` | `accounts` | Edit metadata and service overrides | +| `delete` | `accounts` | Delete accounts | +| `manage_credentials` | `accounts` | Write credentials to an account | +| `test_credentials` | `accounts` | Test account connectivity | +| `discover_org` | `accounts` | Trigger org discovery (admin only) | + +The `allowed_accounts` constraint on existing permissions continues to work: a user with `view` on `recommendations` and `allowed_accounts: ["uuid1"]` only sees recommendations from account `uuid1`. diff --git a/specs/multi-account-execution/backend.md b/specs/multi-account-execution/backend.md new file mode 100644 index 00000000..6cc4dba9 --- /dev/null +++ b/specs/multi-account-execution/backend.md @@ -0,0 +1,365 @@ + +# Backend Architecture + +## Package Structure + +``` +internal/ + credentials/ ← NEW: credential encryption + resolution + cipher.go ← AES-256-GCM encrypt/decrypt primitives + store.go ← save/load encrypted blobs from account_credentials + resolver.go ← decrypt + return provider-specific credential structs + cipher_test.go + resolver_test.go + + accounts/ ← NEW: account-level business logic + org_discovery.go ← AWS Organizations member-account discovery + org_discovery_test.go + + execution/ ← NEW: parallel multi-account fan-out engine + executor.go ← Executor interface + per-account executor + fanout.go ← FanOut: concurrently runs a function across N accounts + collector.go ← aggregates tagged per-account results + fanout_test.go + + config/ + types.go ← ADD: CloudAccount, AccountServiceOverride, CloudAccountFilter + interfaces.go ← ADD: new store method signatures + store_postgres.go ← ADD: implementations for new StoreInterface methods + NOTE: does NOT implement 
credentials.CredentialStore — + that would create a circular import (credentials→config→credentials) + resolver.go ← ADD: ResolveServiceConfig (cascade logic) + + api/ + handler_accounts.go ← NEW: HTTP handlers for /api/accounts/* routes + handler_accounts_test.go + handler.go ← MODIFY: add credStore credentials.CredentialStore to Handler struct + types.go ← MODIFY: add CredentialStore to HandlerConfig + router.go ← MODIFY: register new routes (existing table-driven router) + handler_recommendations.go ← MODIFY: add account_ids filter to recommendations handler + handler_dashboard.go ← MODIFY: add account_ids filter + handler_history.go ← MODIFY: add account_ids filter to history handler + handler_analytics.go ← MODIFY: add account_ids filter to analytics handlers + -- NOTE: internal/server/handler.go is for SCHEDULED TASK dispatch only, + -- not HTTP routing. HTTP routing lives entirely in internal/api/. + + server/ + app.go ← MODIFY: add CredentialStore field to Application, initialise from + credentials.NewCredentialStore(pool, encKey) in DB connect path +``` + +--- + +## Credential Management (`internal/credentials/`) + +### `cipher.go` + +```go +package credentials + +// Encrypt encrypts plaintext using AES-256-GCM. +// Returns the encrypted blob as a single string whose parts are joined by ".". +func Encrypt(key []byte, plaintext []byte) (string, error) + +// Decrypt reverses Encrypt. +func Decrypt(key []byte, blob string) ([]byte, error) + +// KeyFromEnv loads the 32-byte AES encryption key using the following priority order: +// 1. CREDENTIAL_ENCRYPTION_KEY_SECRET_ARN set → call secretsmanager.GetSecretValue(arn), +// treat returned string as 64-char hex key; cache result for Lambda lifetime. +// 2. CREDENTIAL_ENCRYPTION_KEY set → decode as 64-char hex key directly (local dev / non-AWS). +// 3. Neither set → use hardcoded dev-only key; logs WARN "using insecure dev credential key". +func KeyFromEnv() ([]byte, error) +``` + +Key format: `CREDENTIAL_ENCRYPTION_KEY` env var must be a 64-character hex string (32 bytes). 
Generate with: `openssl rand -hex 32`. + +### `store.go` + +**Import note**: `internal/credentials` imports `internal/config` (for `CloudAccount`). It must NOT be imported by `internal/config`; that would create a circular dependency. The concrete `CredentialStore` implementation lives here in `internal/credentials/store.go` and takes a `*pgxpool.Pool` directly (same mechanism `store_postgres.go` uses), keeping the packages independent. + +```go +type CredentialStore interface { + // SaveCredential encrypts and stores credential material. + SaveCredential(ctx context.Context, accountID string, credType string, payload []byte) error + + // DeleteCredential removes stored credential material. + DeleteCredential(ctx context.Context, accountID string, credType string) error + + // HasCredential returns true if credential material exists for the account+type. + HasCredential(ctx context.Context, accountID string, credType string) (bool, error) + + // LoadRaw decrypts and returns raw credential bytes. + // Only called by resolver.go; not for application use. + // It is exported because it is part of the CredentialStore interface, which other + // packages reference; the concrete type (pgCredentialStore) stays unexported. + LoadRaw(ctx context.Context, accountID string, credType string) ([]byte, error) +} + +// NewCredentialStore creates a CredentialStore backed by the account_credentials table. +// Takes a *pgxpool.Pool directly; it does not go through StoreInterface. +func NewCredentialStore(pool *pgxpool.Pool, encKey []byte) CredentialStore +``` + +### `resolver.go` + +```go +// AWSCredentials holds resolved AWS access credentials. +// Note: for role-based auth modes (role_arn, bastion) this struct is not used; +// callers use ResolveAWSCredentialProvider which returns an aws.CredentialsProvider.
+type AWSCredentials struct { + AccessKeyID string + SecretAccessKey string +} + +func (c *AWSCredentials) String() string { return "[REDACTED]" } + +// AzureCredentials holds resolved Azure service principal credentials. +type AzureCredentials struct { + ClientSecret string +} + +func (c *AzureCredentials) String() string { return "[REDACTED]" } + +// STSClient is the minimal STS interface needed for role assumption. +// Satisfied by *sts.Client from github.com/aws/aws-sdk-go-v2/service/sts. +type STSClient interface { + AssumeRole(ctx context.Context, params *sts.AssumeRoleInput, optFns ...func(*sts.Options)) (*sts.AssumeRoleOutput, error) +} + +// ResolveAWSCredentialProvider returns an aws.CredentialsProvider for the given account. +// +// Logic per auth mode: +// access_keys: decrypt blob → static credentials provider +// role_arn: use ambient credentials (instance role/env) → STS AssumeRole +// bastion: resolve bastion account creds first → STS AssumeRole into target account +// +// The returned provider is safe to use for the duration of one API call. +// Callers should NOT cache the returned provider across requests. +func ResolveAWSCredentialProvider( + ctx context.Context, + account *config.CloudAccount, + store CredentialStore, + stsClient STSClient, +) (aws.CredentialsProvider, error) + +// ResolveAzureCredentials decrypts and returns the Azure client secret. +func ResolveAzureCredentials(ctx context.Context, account *config.CloudAccount, store CredentialStore) (*AzureCredentials, error) + +// ResolveGCPCredentials decrypts and returns the GCP service account JSON. +func ResolveGCPCredentials(ctx context.Context, account *config.CloudAccount, store CredentialStore) ([]byte, error) +``` + +**Bastion chain resolution (AWS)**: + +1. Load the hub account (`cloud_accounts.aws_bastion_id`). +2. Resolve hub account's credentials (either access_keys or role_arn — bastion accounts cannot themselves be bastions). +3. 
Use hub credentials to call `sts:AssumeRole` with `aws_role_arn` of the target account and optional `aws_external_id`. +4. Return an `aws.CredentialsProvider` wrapping the assumed-role session token. + +--- + +## Org Discovery (`internal/accounts/org_discovery.go`) + +```go +// DiscoverOrgAccounts calls AWS Organizations ListAccounts using the management +// account's credentials and returns a slice of CloudAccount records (not yet persisted). +// Caller is responsible for deduplication and saving. +func DiscoverOrgAccounts( + ctx context.Context, + mgmtAccount *config.CloudAccount, + credStore credentials.CredentialStore, +) ([]config.CloudAccount, error) +``` + +Implementation: + +1. Resolve credentials for `mgmtAccount` using the credentials package. +2. Create an `organizations.Client` with those credentials. +3. Call `organizations.ListAccounts` (paginated). +4. Map each member account to a `CloudAccount` struct with: + - `Provider = "aws"` + - `ExternalID = *member.Id` + - `Name = *member.Name` + - `Enabled = false` (operator must review and enable) + - `AWSAuthMode = "bastion"` (default: assume they'll use the org root as bastion) + - `AWSBastionID = mgmtAccount.ID` +5. Return the slice; the caller saves it to the DB with `CreateCloudAccount`, skipping duplicates by `(provider, external_id)`. + +--- + +## Parallel Execution Engine (`internal/execution/`) + +### `executor.go` + +```go +// AccountContext bundles an account with its resolved credentials for one execution. +type AccountContext struct { + Account config.CloudAccount + AWSProvider aws.CredentialsProvider // non-nil for AWS accounts + AzureCreds *credentials.AzureCredentials // non-nil for Azure + GCPKeyJSON []byte // non-nil for GCP +} + +// PrepareAccountContext resolves credentials and returns an AccountContext. +// This is the only function that calls the credentials package.
+func PrepareAccountContext(ctx context.Context, account config.CloudAccount, credStore credentials.CredentialStore) (*AccountContext, error) +``` + +### `fanout.go` + +```go +// FanOutResult wraps the result of one account's execution with its account context. +type FanOutResult[T any] struct { + Account config.CloudAccount + Result T + Err error +} + +// FanOut runs fn concurrently for each account with a configurable parallelism limit +// (default: 10). Results are collected in a slice ordered by account ID (deterministic +// for testing — sort by account ID after all goroutines complete). +// +// Implementation note: do NOT use errgroup.WithContext — that cancels the shared context +// on the first error, which would abort in-flight goroutines for other accounts. +// Instead use sync.WaitGroup + a buffered semaphore channel for the parallelism cap, +// and capture each account's error in its own FanOutResult.Err. +// +// A failure in one account's fn does NOT affect others. +// Each error is captured in FanOutResult.Err. +func FanOut[T any]( + ctx context.Context, + accounts []config.CloudAccount, + maxParallel int, + fn func(context.Context, AccountContext) (T, error), + credStore credentials.CredentialStore, +) []FanOutResult[T] +``` + +**Parallelism limit**: Controlled by `CUDLY_MAX_ACCOUNT_PARALLELISM` env var, default `10`. This prevents overwhelming AWS API rate limits when running across many accounts simultaneously. + +### `collector.go` + +```go +// AggregatedRecommendations merges per-account recommendation slices into one slice, +// adding Account metadata to each item. +func AggregatedRecommendations(results []FanOutResult[[]config.RecommendationRecord]) []config.RecommendationRecord + +// AggregatedPurchaseHistory merges per-account history slices. 
+func AggregatedPurchaseHistory(results []FanOutResult[[]config.PurchaseHistoryRecord]) []config.PurchaseHistoryRecord +``` + +--- + +## Service Config Cascade (`internal/config/resolver.go`) + +```go +// ResolveServiceConfig merges an account-level sparse override on top of the +// global service config. Any nil pointer in override inherits from global. +// Returns a fully-populated ServiceConfig (no nil fields). +func ResolveServiceConfig(global *ServiceConfig, override *AccountServiceOverride) *ServiceConfig { + if override == nil { + return global + } + result := *global // copy + if override.Enabled != nil { result.Enabled = *override.Enabled } + if override.Term != nil { result.Term = *override.Term } + if override.Payment != nil { result.Payment = *override.Payment } + if override.Coverage != nil { result.Coverage = *override.Coverage } + if override.RampSchedule != nil { result.RampSchedule = *override.RampSchedule } + if override.IncludeEngines != nil { result.IncludeEngines = override.IncludeEngines } + if override.ExcludeEngines != nil { result.ExcludeEngines = override.ExcludeEngines } + if override.IncludeRegions != nil { result.IncludeRegions = override.IncludeRegions } + if override.ExcludeRegions != nil { result.ExcludeRegions = override.ExcludeRegions } + if override.IncludeTypes != nil { result.IncludeTypes = override.IncludeTypes } + if override.ExcludeTypes != nil { result.ExcludeTypes = override.ExcludeTypes } + return &result +} +``` + +--- + +## Store Interface Additions (`internal/config/interfaces.go`) + +```go +// Cloud Account management +CreateCloudAccount(ctx context.Context, account *CloudAccount) error +GetCloudAccount(ctx context.Context, id string) (*CloudAccount, error) +UpdateCloudAccount(ctx context.Context, account *CloudAccount) error +DeleteCloudAccount(ctx context.Context, id string) error +ListCloudAccounts(ctx context.Context, filter CloudAccountFilter) ([]CloudAccount, error) + +// Account service overrides 
+GetAccountServiceOverride(ctx context.Context, accountID, provider, service string) (*AccountServiceOverride, error) +SaveAccountServiceOverride(ctx context.Context, override *AccountServiceOverride) error +DeleteAccountServiceOverride(ctx context.Context, accountID, provider, service string) error +ListAccountServiceOverrides(ctx context.Context, accountID string) ([]AccountServiceOverride, error) + +// Plan ↔ Account association +SetPlanAccounts(ctx context.Context, planID string, accountIDs []string) error +GetPlanAccounts(ctx context.Context, planID string) ([]CloudAccount, error) +``` + +--- + +## Purchase Plan Execution Changes + +Current flow is triggered via `app.Purchase.ProcessScheduledPurchases(ctx)` (called from +`internal/server/handler.go:handleProcessScheduledPurchases`). The implementation lives in: + +- `internal/purchase/manager.go` — `ProcessScheduledPurchases` orchestrates the flow +- `internal/purchase/execution.go` — per-plan execution logic + +New flow — changes required in `internal/purchase/execution.go`: + +``` +ProcessScheduledPurchases(ctx): + for each plan: + 1. Derive plan's provider: iterate plan.Services map keys (format "provider:service", + e.g. "aws:ec2") and extract the provider. In practice the frontend always sends a + single provider, so take the first distinct provider found. + 2. Load plan accounts from plan_accounts (if empty: load all enabled accounts + matching that derived provider) + 3. For each account: PrepareAccountContext(ctx, account, credStore) + 4. FanOut(ctx, accountContexts, fn=executeForAccount) + 5. For each FanOutResult: + a. Create purchase_execution row tagged with cloud_account_id + b. If Err != nil: mark execution failed, log, continue + c. If ok: execute purchases against account's cloud API + 6. Aggregate and return summary +``` + +**Note on `PurchasePlan.Provider`**: The `PurchasePlan` struct (`internal/config/types.go`) has no direct `provider` field. 
Provider is embedded in the `Services` map keys (e.g. `"aws:ec2"`). Do not add a `provider` column to `purchase_plans` as part of this feature; derive it from the services map at runtime. + +The `purchase_executions` table now has a `cloud_account_id` column so each execution row is linked to the specific account it ran against. + +--- + +## Modified Handler: `account_ids` Filtering + +The existing handlers in `internal/api/` use a Lambda Function URL request model, not +`*http.Request`. Query parameter extraction uses the Lambda event: + +```go +// parseAccountIDs extracts and validates the account_ids query param from a Lambda request. +// Returns nil slice (= all accounts) if param is absent or empty. +// Add this helper to internal/api/handler.go or a shared helpers file. +func parseAccountIDs(req *events.LambdaFunctionURLRequest) ([]string, error) { + raw := req.QueryStringParameters["account_ids"] + if raw == "" { + return nil, nil + } + parts := strings.Split(raw, ",") + ids := make([]string, 0, len(parts)) + for _, part := range parts { + id := strings.TrimSpace(part) + if _, err := uuid.Parse(id); err != nil { + return nil, fmt.Errorf("invalid account_id %q: %w", id, err) + } + ids = append(ids, id) + } + return ids, nil +} +``` + +History queries that currently filter by `account_id VARCHAR(20)` (raw AWS account ID) will also support filtering by `cloud_account_id UUID` for multi-account scenarios. The existing single-account behavior is preserved when `account_ids` is absent. diff --git a/specs/multi-account-execution/data-model.md b/specs/multi-account-execution/data-model.md new file mode 100644 index 00000000..e19438c8 --- /dev/null +++ b/specs/multi-account-execution/data-model.md @@ -0,0 +1,329 @@ + +# Data Model + +## Overview + +Four new tables are introduced. Four existing tables gain a nullable FK column (`cloud_account_id`) for forward-compatible account attribution. 
All changes are backward-compatible: existing rows keep working with `cloud_account_id = NULL` (interpreted as "legacy / pre-migration record"). + +Migration file: `internal/database/postgres/migrations/000011_cloud_accounts.up.sql` + +--- + +## New Tables + +### `cloud_accounts` + +Central registry for every managed cloud account/subscription/project. + +```sql +CREATE TABLE cloud_accounts ( + id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), + + -- Display / metadata + name VARCHAR(255) NOT NULL, + description TEXT, + contact_email VARCHAR(255), + enabled BOOLEAN NOT NULL DEFAULT true, + + -- Provider identification + provider VARCHAR(32) NOT NULL + CHECK (provider IN ('aws', 'azure', 'gcp')), + external_id VARCHAR(255) NOT NULL, + -- external_id meaning per provider: + -- aws: 12-digit AWS account ID (e.g. "123456789012") + -- azure: Azure subscription UUID (e.g. "xxxxxxxx-xxxx-...") + -- gcp: GCP project ID (e.g. "my-project-123") + + -- ── AWS-specific ───────────────────────────────────────── + aws_auth_mode VARCHAR(32) + CHECK (aws_auth_mode IN ('access_keys', 'role_arn', 'bastion')), + aws_role_arn VARCHAR(512), -- for role_arn and bastion modes + aws_external_id VARCHAR(255), -- ExternalId for STS AssumeRole (optional) + aws_bastion_id UUID -- for bastion mode: FK to the hub account + REFERENCES cloud_accounts(id) ON DELETE SET NULL, + aws_is_org_root BOOLEAN NOT NULL DEFAULT false, + -- When true: CUDly treats this as an AWS Organizations management account + -- and can discover member accounts via the Organizations API. 
+ + -- ── Azure-specific ─────────────────────────────────────── + azure_subscription_id VARCHAR(36), + azure_tenant_id VARCHAR(36), + azure_client_id VARCHAR(36), -- non-secret; client_secret stored in account_credentials + + -- ── GCP-specific ───────────────────────────────────────── + gcp_project_id VARCHAR(255), + gcp_client_email VARCHAR(255), -- non-secret; private_key stored in account_credentials + + -- Audit + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + created_by UUID REFERENCES users(id) ON DELETE SET NULL, + + UNIQUE(provider, external_id) +); + +CREATE TRIGGER update_cloud_accounts_updated_at + BEFORE UPDATE ON cloud_accounts + FOR EACH ROW EXECUTE FUNCTION update_updated_at_column(); + +CREATE INDEX idx_cloud_accounts_provider ON cloud_accounts(provider) WHERE enabled = true; +CREATE INDEX idx_cloud_accounts_org_root ON cloud_accounts(aws_is_org_root) WHERE aws_is_org_root = true; +``` + +**Constraints:** + +- `aws_bastion_id` must not reference an account that itself has `aws_auth_mode = 'bastion'` — enforced at application layer (prevents chaining). +- When `aws_auth_mode = 'role_arn'` or `'bastion'`, `aws_role_arn` must be non-null — validated on write. +- When `provider = 'azure'`, `azure_subscription_id` and `azure_tenant_id` must be non-null — validated on write. + +**`external_id` vs provider-specific ID fields:** +`external_id` is the provider-agnostic unique key used for deduplication (`UNIQUE(provider, external_id)`). For Azure and GCP, `external_id` duplicates the provider-specific field: + +- Azure: `external_id` = `azure_subscription_id` (both are the subscription UUID). On create/update, if both are provided they must match; if only `external_id` is provided, the handler sets `azure_subscription_id = external_id` automatically. +- GCP: `external_id` = `gcp_project_id` (both are the project ID string). Same derivation rule applies. 
+- AWS: `external_id` is the 12-digit AWS account ID; there is no separate `aws_account_id` column (not needed — the account ID is not provider-specific secret data). + +--- + +### `account_credentials` + +Stores encrypted credential material. Never returned via GET responses. + +```sql +CREATE TABLE account_credentials ( + id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), + account_id UUID NOT NULL + REFERENCES cloud_accounts(id) ON DELETE CASCADE, + credential_type VARCHAR(32) NOT NULL, + -- Allowed values and the JSON structure they encrypt: + -- + -- 'aws_access_keys' → {"access_key_id": "...", "secret_access_key": "..."} + -- 'azure_client_secret' → {"client_secret": "..."} + -- 'gcp_service_account' → full service account JSON blob + -- + encrypted_blob TEXT NOT NULL, + -- AES-256-GCM ciphertext, base64url-encoded. + -- Format: . + -- Key source: loaded by KeyFromEnv() — see iac.md and security.md for full load order + -- (prod: CREDENTIAL_ENCRYPTION_KEY_SECRET_ARN → Secrets Manager; + -- dev: CREDENTIAL_ENCRYPTION_KEY env var directly) + + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + + UNIQUE(account_id, credential_type) +); + +CREATE TRIGGER update_account_credentials_updated_at + BEFORE UPDATE ON account_credentials + FOR EACH ROW EXECUTE FUNCTION update_updated_at_column(); +``` + +**Note**: `account_credentials` has no SELECT-returning API endpoint. The application only reads it server-side to resolve credentials before making cloud API calls. + +--- + +### `account_service_overrides` + +Per-account sparse overrides for service configuration. `NULL` in any column means "inherit from global `service_configs`". 
+ +```sql +CREATE TABLE account_service_overrides ( + id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), + account_id UUID NOT NULL + REFERENCES cloud_accounts(id) ON DELETE CASCADE, + provider VARCHAR(32) NOT NULL, + service VARCHAR(64) NOT NULL, + + -- All override fields are nullable; NULL = inherit global default + enabled BOOLEAN, + term INTEGER, -- years: 1 or 3 + payment VARCHAR(32), -- no-upfront / partial-upfront / all-upfront + coverage DECIMAL(5,2), -- 0–100 + ramp_schedule VARCHAR(32), + include_engines TEXT[], + exclude_engines TEXT[], + include_regions TEXT[], + exclude_regions TEXT[], + include_types TEXT[], + exclude_types TEXT[], + + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + + UNIQUE(account_id, provider, service) +); + +CREATE TRIGGER update_account_service_overrides_updated_at + BEFORE UPDATE ON account_service_overrides + FOR EACH ROW EXECUTE FUNCTION update_updated_at_column(); + +CREATE INDEX idx_aso_account ON account_service_overrides(account_id); +``` + +**Cascade resolution logic** (implemented in `internal/config/resolver.go`): + +``` +effectiveConfig(account, provider, service): + global = service_configs WHERE provider=? AND service=? + override = account_service_overrides WHERE account_id=? AND provider=? AND service=? + IF override IS NULL: return global + RETURN { + enabled: COALESCE(override.enabled, global.enabled), + term: COALESCE(override.term, global.term), + payment: COALESCE(override.payment, global.payment), + coverage: COALESCE(override.coverage, global.coverage), + ramp_schedule: COALESCE(override.ramp_schedule, global.ramp_schedule), + include_engines: COALESCE(override.include_engines, global.include_engines), + ... etc + } +``` + +--- + +### `plan_accounts` + +Many-to-many join: which cloud accounts a purchase plan targets. 
+ +```sql +CREATE TABLE plan_accounts ( + plan_id UUID NOT NULL REFERENCES purchase_plans(id) ON DELETE CASCADE, + account_id UUID NOT NULL REFERENCES cloud_accounts(id) ON DELETE CASCADE, + PRIMARY KEY (plan_id, account_id) +); + +CREATE INDEX idx_plan_accounts_account ON plan_accounts(account_id); +``` + +**Behaviour when the list is empty**: A plan with no rows in `plan_accounts` targets **all enabled accounts** for its configured provider. This is the default for backward compatibility with plans created before multi-account support. + +--- + +## Existing Table Modifications + +The following tables gain a nullable `cloud_account_id` column so that new records can be attributed to a specific managed account. Existing rows default to `NULL` (no attribution required). + +### `purchase_history` + +```sql +ALTER TABLE purchase_history + ADD COLUMN cloud_account_id UUID REFERENCES cloud_accounts(id) ON DELETE SET NULL; + +CREATE INDEX idx_purchase_history_cloud_account + ON purchase_history(cloud_account_id) WHERE cloud_account_id IS NOT NULL; +``` + +### `purchase_executions` + +```sql +ALTER TABLE purchase_executions + ADD COLUMN cloud_account_id UUID REFERENCES cloud_accounts(id) ON DELETE SET NULL; + +CREATE INDEX idx_purchase_executions_cloud_account + ON purchase_executions(cloud_account_id) WHERE cloud_account_id IS NOT NULL; +``` + +### `savings_snapshots` + +```sql +ALTER TABLE savings_snapshots + ADD COLUMN cloud_account_id UUID REFERENCES cloud_accounts(id) ON DELETE SET NULL; +``` + +### `ri_exchange_history` + +```sql +ALTER TABLE ri_exchange_history + ADD COLUMN cloud_account_id UUID REFERENCES cloud_accounts(id) ON DELETE SET NULL; +``` + +**Note on `account_id VARCHAR(20)`**: These columns are kept as-is. They store the raw cloud-provider account identifier (AWS 12-digit ID, Azure subscription GUID, etc.) and are still populated for all new records. `cloud_account_id` is the normalized FK into CUDly's own registry. 
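
A minimal sketch of how a history query could honour both columns during the transition, assuming a pgx-backed store with `$n` placeholders. `buildHistoryFilter` and its parameters are hypothetical names used for illustration only; the spec does not define this helper, and the placeholder and cast style may need adjusting to the driver in use.

```go
package main

import (
	"fmt"
	"strings"
)

// buildHistoryFilter is a hypothetical helper (not defined by this spec).
// It builds a WHERE clause for purchase_history that can filter on either:
//   - rawAccountID:    the legacy account_id VARCHAR(20) column
//   - cloudAccountIDs: the new nullable cloud_account_id UUID column
//
// Empty inputs add no condition, so legacy rows with cloud_account_id = NULL
// are still returned when no account filter is requested.
func buildHistoryFilter(rawAccountID string, cloudAccountIDs []string) (string, []any) {
	var conds []string
	var args []any

	if rawAccountID != "" {
		args = append(args, rawAccountID)
		conds = append(conds, fmt.Sprintf("account_id = $%d", len(args)))
	}
	if len(cloudAccountIDs) > 0 {
		// Assumption: the driver accepts a []string for a uuid[] parameter.
		args = append(args, cloudAccountIDs)
		conds = append(conds, fmt.Sprintf("cloud_account_id = ANY($%d::uuid[])", len(args)))
	}
	if len(conds) == 0 {
		return "", nil
	}
	return "WHERE " + strings.Join(conds, " AND "), args
}

func main() {
	where, args := buildHistoryFilter(
		"123456789012",
		[]string{"11111111-1111-1111-1111-111111111111"},
	)
	fmt.Println(where)
	fmt.Println(len(args))
}
```

Keeping the two conditions independent means a caller that passes neither filter gets the pre-migration behaviour unchanged, while a caller that passes `cloudAccountIDs` only sees rows that have already been attributed to a managed account.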
+ +--- + +## Migration File Structure + +``` +internal/database/postgres/migrations/ + 000011_cloud_accounts.up.sql ← creates the 4 new tables, adds FK columns, creates indexes + 000011_cloud_accounts.down.sql ← drops indexes, removes FK columns, drops tables in reverse order +``` + +Down migration order (reverse dependency): + +1. `DROP TABLE plan_accounts` +2. `DROP TABLE account_service_overrides` +3. `DROP TABLE account_credentials` +4. Remove FK columns from `purchase_history`, `purchase_executions`, `savings_snapshots`, `ri_exchange_history` +5. `DROP TABLE cloud_accounts` + +--- + +## Go Type Additions (`internal/config/types.go`) + +```go +// CloudAccount represents a single managed cloud account/subscription/project. +type CloudAccount struct { + ID string `json:"id"` + Name string `json:"name"` + Description string `json:"description,omitempty"` + ContactEmail string `json:"contact_email,omitempty"` + Enabled bool `json:"enabled"` + Provider string `json:"provider"` + ExternalID string `json:"external_id"` + + // AWS + AWSAuthMode string `json:"aws_auth_mode,omitempty"` + AWSRoleARN string `json:"aws_role_arn,omitempty"` + AWSExternalID string `json:"aws_external_id,omitempty"` + AWSBastionID string `json:"aws_bastion_id,omitempty"` + AWSIsOrgRoot bool `json:"aws_is_org_root,omitempty"` + + // Azure + AzureSubscriptionID string `json:"azure_subscription_id,omitempty"` + AzureTenantID string `json:"azure_tenant_id,omitempty"` + AzureClientID string `json:"azure_client_id,omitempty"` + + // GCP + GCPProjectID string `json:"gcp_project_id,omitempty"` + GCPClientEmail string `json:"gcp_client_email,omitempty"` + + // Derived (not stored) + CredentialsConfigured bool `json:"credentials_configured"` + BastionAccountName string `json:"bastion_account_name,omitempty"` + + CreatedAt time.Time `json:"created_at"` + UpdatedAt time.Time `json:"updated_at"` + CreatedBy string `json:"created_by,omitempty"` +} + +// CloudAccountFilter for ListCloudAccounts queries. 
+type CloudAccountFilter struct { + Provider *string + Enabled *bool + Search string // substring match on name or external_id + BastionID *string // return accounts whose aws_bastion_id = *BastionID; used by delete handler +} + +// AccountServiceOverride is a sparse per-account override on top of the global ServiceConfig. +// Nil pointer fields inherit the global value. +type AccountServiceOverride struct { + ID string `json:"id"` + AccountID string `json:"account_id"` + Provider string `json:"provider"` + Service string `json:"service"` + Enabled *bool `json:"enabled,omitempty"` + Term *int `json:"term,omitempty"` + Payment *string `json:"payment,omitempty"` + Coverage *float64 `json:"coverage,omitempty"` + RampSchedule *string `json:"ramp_schedule,omitempty"` + IncludeEngines []string `json:"include_engines,omitempty"` + ExcludeEngines []string `json:"exclude_engines,omitempty"` + IncludeRegions []string `json:"include_regions,omitempty"` + ExcludeRegions []string `json:"exclude_regions,omitempty"` + IncludeTypes []string `json:"include_types,omitempty"` + ExcludeTypes []string `json:"exclude_types,omitempty"` + CreatedAt time.Time `json:"created_at"` + UpdatedAt time.Time `json:"updated_at"` +} +``` diff --git a/specs/multi-account-execution/frontend.md b/specs/multi-account-execution/frontend.md new file mode 100644 index 00000000..af3be511 --- /dev/null +++ b/specs/multi-account-execution/frontend.md @@ -0,0 +1,571 @@ + +# Frontend Design + +## Overview + +The frontend changes fall into four areas: + +1. **Settings Tab** — account CRUD UI per provider (AWS / Azure / GCP) +2. **Service Override UI** — per-account overrides for service defaults +3. **Filter bar additions** — account multi-select in Dashboard, Recommendations, History, Plans +4. **Plan account association** — account selector in plan create/edit modal + +All changes are backward-compatible: if no accounts are configured, the UI behaves identically to today. 
+ +--- + +## State Model Changes + +### `frontend/src/api/types.ts` + +Add `CloudAccount` here — it is an API response type, consistent with all other API response shapes (`AzureCredentials`, `GCPCredentials`, `Plan`, `PurchaseHistory`, etc.) that live in `frontend/src/api/types.ts`. + +```typescript +// New type — add to frontend/src/api/types.ts +export interface CloudAccount { + id: string; + name: string; + description?: string; + contact_email?: string; + enabled: boolean; + provider: Provider; + external_id: string; + + // AWS-specific (may be absent for Azure/GCP) + aws_auth_mode?: 'access_keys' | 'role_arn' | 'bastion'; + aws_role_arn?: string; + aws_external_id?: string; + aws_bastion_id?: string; + aws_is_org_root?: boolean; + bastion_account_name?: string; // resolved display name + + // Azure-specific + azure_subscription_id?: string; + azure_tenant_id?: string; + azure_client_id?: string; + + // GCP-specific + gcp_project_id?: string; + gcp_client_email?: string; + + credentials_configured: boolean; + created_at: string; + updated_at: string; +} +``` + +### `frontend/src/types.ts` + +Only the `AppState` interface needs modification here (not `CloudAccount`): + +```typescript +// Add currentAccountIDs to AppState +export interface AppState { + currentUser: api.User | null; + currentProvider: api.Provider | 'all'; + currentAccountIDs: string[]; // ← NEW: selected account UUIDs; [] = all accounts + currentRecommendations: api.Recommendation[]; + selectedRecommendations: Set; + savingsChart: Chart | null; +} +``` + +### `frontend/src/state.ts` + +```typescript +// Add getter/setter pair +export function getCurrentAccountIDs(): string[] { + return state.currentAccountIDs; +} +export function setCurrentAccountIDs(ids: string[]): void { + state.currentAccountIDs = ids; +} +``` + +--- + +## New Types: `frontend/src/api/types.ts` additions + +Add these alongside `CloudAccount`: + +```typescript +// Request/response types for accounts API — add to 
frontend/src/api/types.ts + +export interface CreateAccountRequest { + name: string; + description?: string; + contact_email?: string; + enabled: boolean; + provider: Provider; + external_id: string; + // AWS + aws_auth_mode?: 'access_keys' | 'role_arn' | 'bastion'; + aws_role_arn?: string; + aws_external_id?: string; + aws_bastion_id?: string; + aws_is_org_root?: boolean; + // Azure + azure_subscription_id?: string; + azure_tenant_id?: string; + azure_client_id?: string; + // GCP + gcp_project_id?: string; + // Inline credentials (optional — inferred from auth_mode; no "type" field needed here) + credentials?: { + access_key_id?: string; // aws_auth_mode=access_keys + secret_access_key?: string; // aws_auth_mode=access_keys + client_secret?: string; // azure + service_account_json?: string; // gcp + }; +} + +// provider and external_id are immutable after creation — server rejects changes to them. +// Omit them from UpdateAccountRequest to make the type honest and prevent accidental sends. 
+export type UpdateAccountRequest = Omit; + +export interface AccountCredentials { + type: 'aws_access_keys' | 'azure_client_secret' | 'gcp_service_account'; + access_key_id?: string; + secret_access_key?: string; + client_secret?: string; + service_account_json?: string; +} + +export interface AccountServiceOverride { + id: string; + account_id: string; + provider: string; + service: string; + enabled?: boolean; + term?: number; + payment?: string; + coverage?: number; + ramp_schedule?: string; + include_regions?: string[]; + exclude_regions?: string[]; + include_engines?: string[]; + exclude_engines?: string[]; + include_types?: string[]; + exclude_types?: string[]; +} + +export type ServiceOverrideFields = Omit; + +export interface OrgDiscoveryResult { + discovered: number; + created: number; + skipped: number; + accounts: Array<{ name: string; external_id: string; created: boolean }>; +} + +export interface PlanAccountsResponse { + account_ids: string[]; + accounts: Array>; +} +``` + +--- + +## New API Module: `frontend/src/api/accounts.ts` + +`listAccounts` unwraps the `{ "accounts": [...] }` API wrapper and returns the array directly +(same pattern as existing API functions). + +```typescript +export async function listAccounts(provider?: Provider): Promise + // calls GET /api/accounts?provider=... 
→ unwraps response.accounts +export async function createAccount(data: CreateAccountRequest): Promise +export async function updateAccount(id: string, data: UpdateAccountRequest): Promise +export async function deleteAccount(id: string): Promise +export async function saveAccountCredentials(id: string, creds: AccountCredentials): Promise +export async function testAccountCredentials(id: string): Promise<{ ok: boolean; error?: string; caller_identity?: string }> +export async function listAccountServiceOverrides(id: string): Promise + // unwraps response.overrides +export async function saveAccountServiceOverride(id: string, provider: string, service: string, override: Partial): Promise +export async function deleteAccountServiceOverride(id: string, provider: string, service: string): Promise +export async function discoverOrgAccounts(orgRootAccountId: string): Promise +export async function getPlanAccounts(planId: string): Promise +export async function setPlanAccounts(planId: string, accountIds: string[]): Promise +``` + +--- + +## Settings Tab: Account Management Section + +### Placement in `index.html` + +Inside each provider fieldset (`#aws-settings`, `#azure-settings`, `#gcp-settings`), add an **Accounts** sub-section **before** the existing "Service Defaults" heading: + +```html + +

+ Accounts + +

+ +``` + +Repeat for `#azure-account-list` and `#gcp-account-list`. + +### Account Row (rendered by JS via DOM methods — no raw HTML injection) + +Each row is built with `document.createElement` / `textContent` to prevent XSS: + +``` +[Account Name] [external_id] [auth-mode badge] [✓ Credentials set | ⚠ No credentials] + [Test] [Credentials] [Overrides] [Edit] [Delete] +``` + +Button `data-account-id` attributes drive click handlers (no inline event attributes). + +--- + +## Modals + +### Notes on existing modal patterns + +Existing modals (`#azure-creds-modal`, `#gcp-creds-modal`) use class `modal hidden` (NOT `modal-overlay hidden`). +Their close buttons use id pattern `close-X-modal-btn` with class `btn-secondary`. +New account modals must follow the same pattern to be consistent with `app.ts` event binding. + +### AWS Account Modal (`#aws-account-modal`) + +Dynamic form — shows different credential fields based on selected auth mode via CSS class toggling. + +``` +┌─ Add AWS Account ──────────────────────────────────────────┐ +│ Name * [_________________________] │ +│ AWS Account ID * [_________________________] │ +│ Description [_________________________] │ +│ Contact Email [_________________________] │ +│ Auth Mode * [▼ Role ARN ] │ +│ ─── Role ARN ───────────────────────────────────────── │ +│ Role ARN * [arn:aws:iam::... ] │ +│ External ID [_________________________] │ +│ ─── (if Access Keys selected) ──────────────────── │ +│ Access Key ID * [_________________________] │ +│ Secret Access Key * [_________________________] │ +│ ─── (if Bastion selected) ───────────────────────── │ +│ Bastion Account * [▼ Select bastion account] │ +│ Role ARN in Target * [arn:aws:iam::... 
] │ +│ External ID [_________________________] │ +│ │ +│ [ ] This is an AWS Organizations root account │ +│ (enables member account discovery) │ +│ │ +│ [Cancel] [Save Account] │ +└─────────────────────────────────────────────────────────────┘ +``` + +Modal HTML structure (matching existing pattern — class `modal hidden`, close button id `close-X-modal-btn`): + +```html + +``` + +Auth mode sections use `hidden` CSS class toggled by JS (same pattern as provider settings sections in `settings.ts`). + +### Azure Subscription Modal (`#azure-account-modal`) + +``` +┌─ Add Azure Subscription ───────────────────────────────────┐ +│ Name * [_________________________] │ +│ Subscription ID * [xxxxxxxx-xxxx-xxxx-xxxx-] │ +│ Tenant ID * [yyyyyyyy-xxxx-xxxx-xxxx-] │ +│ Client ID * [zzzzzzzz-xxxx-xxxx-xxxx-] │ +│ Client Secret * [_________________________] │ +│ Description [_________________________] │ +│ Contact Email [_________________________] │ +│ [Cancel] [Save Subscription] │ +└─────────────────────────────────────────────────────────────┘ +``` + +HTML: `