From d7f3fe6e3d791fc7268d19e2dc6cee629dc4a4e3 Mon Sep 17 00:00:00 2001 From: Colin Walters Date: Tue, 25 Nov 2025 09:26:35 -0500 Subject: [PATCH] docs: adopt AGENTS.md standard for agent instructions Rename CLAUDE.md to AGENTS.md to align with the community AGENTS.md convention (https://agents.md/). This standardized format makes the repository's agent instructions more discoverable and consistent with other projects using AI coding assistants. My real motivation here is I want to support other agentic systems and not having this project hardcoded to Claude Code, which is proprietary software. Assisted-by: Claude Code (Sonnet 4.5) Signed-off-by: Colin Walters --- .claude/CLAUDE.md | 1 + .claude/amber-config.yml | 2 +- .github/ISSUE_TEMPLATE/amber-refactor.yml | 4 +- .../ISSUE_TEMPLATE/amber-test-coverage.yml | 2 +- .github/workflows/amber-issue-handler.yml | 6 +- .github/workflows/claude-code-review.yml | 4 +- .specify/memory/constitution.md | 4 +- .../memory/constitution_update_checklist.md | 2 +- .specify/scripts/bash/update-agent-context.sh | 4 +- AGENTS.md | 1066 ++++++++++++++++- AMBER_SETUP.md | 2 +- CLAUDE.md | 1063 +--------------- CONTRIBUTING.md | 4 +- README.md | 2 +- agents/amber.md | 26 +- components/backend/README.md | 2 +- components/operator/README.md | 2 +- .../runners/claude-code-runner/pyproject.toml | 1 - docs/amber-automation.md | 8 +- docs/amber-quickstart.md | 6 +- .../amber-implementation.md | 24 +- docs/labs/basic/lab-1-first-rfe.md | 2 +- docs/labs/index.md | 2 +- docs/reference/index.md | 4 +- docs/user-guide/getting-started.md | 2 +- docs/user-guide/index.md | 2 +- docs/user-guide/working-with-amber.md | 10 +- 27 files changed, 1130 insertions(+), 1127 deletions(-) create mode 120000 .claude/CLAUDE.md mode change 120000 => 100644 AGENTS.md mode change 100644 => 120000 CLAUDE.md diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md new file mode 120000 index 000000000..be77ac83a --- /dev/null +++ b/.claude/CLAUDE.md @@ -0,0 +1 @@ 
+../AGENTS.md \ No newline at end of file diff --git a/.claude/amber-config.yml b/.claude/amber-config.yml index fe7cf2062..9795d6ac7 100644 --- a/.claude/amber-config.yml +++ b/.claude/amber-config.yml @@ -132,7 +132,7 @@ automation_policies: # Constitution compliance checks constitution_checks: enabled: true - reference_file: "CLAUDE.md" + reference_file: "AGENTS.md" principles: - id: "principle_v_modularity" diff --git a/.github/ISSUE_TEMPLATE/amber-refactor.yml b/.github/ISSUE_TEMPLATE/amber-refactor.yml index fa6b4c63d..4ee286443 100644 --- a/.github/ISSUE_TEMPLATE/amber-refactor.yml +++ b/.github/ISSUE_TEMPLATE/amber-refactor.yml @@ -50,7 +50,7 @@ body: - Must maintain backward compatibility - No breaking changes to API - All existing tests must pass - - Follow patterns in CLAUDE.md + - Follow patterns in AGENTS.md validations: required: true @@ -70,7 +70,7 @@ body: attributes: label: Confirmation options: - - label: I have reviewed CLAUDE.md standards + - label: I have reviewed AGENTS.md standards required: true - label: Backward compatibility is required required: true diff --git a/.github/ISSUE_TEMPLATE/amber-test-coverage.yml b/.github/ISSUE_TEMPLATE/amber-test-coverage.yml index 94f3662b7..48e09aec4 100644 --- a/.github/ISSUE_TEMPLATE/amber-test-coverage.yml +++ b/.github/ISSUE_TEMPLATE/amber-test-coverage.yml @@ -66,7 +66,7 @@ body: attributes: label: Confirmation options: - - label: Tests should follow patterns in CLAUDE.md + - label: Tests should follow patterns in AGENTS.md required: true - label: Table-driven tests for Go, pytest for Python required: true diff --git a/.github/workflows/amber-issue-handler.yml b/.github/workflows/amber-issue-handler.yml index 82ce87bda..59216abd8 100644 --- a/.github/workflows/amber-issue-handler.yml +++ b/.github/workflows/amber-issue-handler.yml @@ -134,7 +134,7 @@ jobs: ### For `test-coverage` type: 1. Analyze current test coverage for specified files 2. Identify untested code paths - 3. 
Write contract tests following project standards (see CLAUDE.md) + 3. Write contract tests following project standards (see AGENTS.md) 4. Ensure tests follow table-driven test pattern (Go) or pytest patterns (Python) 5. Verify all new tests pass @@ -146,7 +146,7 @@ jobs: ## Requirements - - Follow all standards in `CLAUDE.md` + - Follow all standards in AGENTS.md - Use conventional commit format: `type(scope): message` - Run all linters BEFORE committing: - Go: `gofmt -w .`, `golangci-lint run` @@ -362,7 +362,7 @@ jobs: ### Pre-merge Checklist - [ ] All linters pass - [ ] All tests pass - - [ ] Changes follow project conventions (CLAUDE.md) + - [ ] Changes follow project conventions (AGENTS.md) - [ ] No scope creep beyond issue description ### Reviewer Notes diff --git a/.github/workflows/claude-code-review.yml b/.github/workflows/claude-code-review.yml index 69d913a2b..d7e8ddb89 100644 --- a/.github/workflows/claude-code-review.yml +++ b/.github/workflows/claude-code-review.yml @@ -90,11 +90,11 @@ jobs: Perform a comprehensive code review with the following focus areas: 1. **Code Quality & Best Practices** - - Follow repository's CLAUDE.md guidelines + - Follow repository's AGENTS.md guidelines - Clean code principles and design patterns - Proper error handling and edge cases - Code readability and maintainability - - TypeScript/Go best practices (see CLAUDE.md) + - TypeScript/Go best practices (see AGENTS.md) 2. 
**Security** - Potential security vulnerabilities diff --git a/.specify/memory/constitution.md b/.specify/memory/constitution.md index d573eccce..7957b5f4c 100644 --- a/.specify/memory/constitution.md +++ b/.specify/memory/constitution.md @@ -334,7 +334,7 @@ Replace usage of "vTeam" with "ACP" (Ambient Code Platform) where it is safe and **Incremental Approach**: -- Update documentation first (README, CLAUDE.md, docs/) +- Update documentation first (README, AGENTS.md, docs/) - Update UI text in new features - Use ACP naming in new code modules - Do NOT perform mass renames - update organically during feature work @@ -409,7 +409,7 @@ npm run build # Must pass with 0 errors, 0 warnings Runtime development guidance is maintained in: -- `/CLAUDE.md` for Claude Code development +- `/AGENTS.md` for Claude Code development - Component-specific README files - MkDocs documentation in `/docs` diff --git a/.specify/memory/constitution_update_checklist.md b/.specify/memory/constitution_update_checklist.md index 7f15d7ff6..acfb88acb 100644 --- a/.specify/memory/constitution_update_checklist.md +++ b/.specify/memory/constitution_update_checklist.md @@ -10,7 +10,7 @@ When amending the constitution (`/memory/constitution.md`), ensure all dependent - [ ] `/templates/tasks-template.md` - Update if new task types needed - [ ] `/.claude/commands/plan.md` - Update if planning process changes - [ ] `/.claude/commands/tasks.md` - Update if task generation affected -- [ ] `/CLAUDE.md` - Update runtime development guidelines +- [ ] `/AGENTS.md` - Update runtime development guidelines ### Article-specific updates: diff --git a/.specify/scripts/bash/update-agent-context.sh b/.specify/scripts/bash/update-agent-context.sh index 2a44c68a1..75341fc05 100755 --- a/.specify/scripts/bash/update-agent-context.sh +++ b/.specify/scripts/bash/update-agent-context.sh @@ -58,8 +58,8 @@ eval $(get_feature_paths) NEW_PLAN="$IMPL_PLAN" # Alias for compatibility with existing code AGENT_TYPE="${1:-}" -# 
Agent-specific file paths -CLAUDE_FILE="$REPO_ROOT/CLAUDE.md" +# Agent-specific file paths +CLAUDE_FILE="$REPO_ROOT/AGENTS.md" GEMINI_FILE="$REPO_ROOT/GEMINI.md" COPILOT_FILE="$REPO_ROOT/.github/copilot-instructions.md" CURSOR_FILE="$REPO_ROOT/.cursor/rules/specify-rules.mdc" diff --git a/AGENTS.md b/AGENTS.md deleted file mode 120000 index 681311eb9..000000000 --- a/AGENTS.md +++ /dev/null @@ -1 +0,0 @@ -CLAUDE.md \ No newline at end of file diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 000000000..973048e68 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,1065 @@ +# AGENTS.md + +This file implements [AGENTS.md](https://agents.md/) with instructions for coding agents +when working with code in this repository. + +## Project Overview + +The **Ambient Code Platform** is a Kubernetes-native AI automation platform that orchestrates intelligent agentic sessions through containerized microservices. The platform enables AI-powered automation for analysis, research, development, and content creation tasks via a modern web interface. + +> **Note:** This project was formerly known as "vTeam". Technical artifacts (image names, namespaces, API groups) still use "vteam" for backward compatibility. + +> **Note:** At the current time, much of this project is hardcoded to Claude Code. Further references below continue that. It will be a wider project to decouple this project from Claude Code specifically, but having the instructions for this project use the generic AGENTS.md is the start of that decoupling. + +### Amber Background Agent + +The platform includes **Amber**, a background agent that automates common development tasks via GitHub Issues. Team members can trigger automated fixes, refactoring, and test additions without requiring direct access to a specific agentic AI system like Gemini CLI, Goose or Claude Code. 
+ +**Quick Links**: + +- [Amber Quickstart](docs/amber-quickstart.md) - Get started in 5 minutes +- [Full Documentation](docs/amber-automation.md) - Complete automation guide +- [Amber Config](.claude/amber-config.yml) - Automation policies + +**Common Workflows**: + +- 🤖 **Auto-Fix** (label: `amber:auto-fix`): Formatting, linting, trivial fixes +- 🔧 **Refactoring** (label: `amber:refactor`): Break large files, extract patterns +- 🧪 **Test Coverage** (label: `amber:test-coverage`): Add missing tests + +### Core Architecture + +The system follows a Kubernetes-native pattern with Custom Resources, Operators, and Job execution: + +1. **Frontend** (NextJS + Shadcn): Web UI for session management and monitoring +2. **Backend API** (Go + Gin): REST API managing Kubernetes Custom Resources with multi-tenant project isolation +3. **Agentic Operator** (Go): Kubernetes controller watching CRs and creating Jobs +4. **Claude Code Runner** (Python): Job pods executing Claude Code CLI with multi-agent collaboration + +### Agentic Session Flow + +``` +User Creates Session → Backend Creates CR → Operator Spawns Job → +Pod Runs Claude CLI → Results Stored in CR → UI Displays Progress +``` + +## Memory System - Loadable Context + +This repository uses a structured **memory system** to provide targeted, loadable context instead of relying solely on this comprehensive AGENTS.md file. 
+ +### Quick Reference + +**Load these files when working in specific areas:** + +| Task Type | Context File | Architecture View | Pattern File | +|-----------|--------------|-------------------|--------------| +| **Backend API work** | `.claude/context/backend-development.md` | `repomix-analysis/03-architecture-only.xml` | `.claude/patterns/k8s-client-usage.md` | +| **Frontend UI work** | `.claude/context/frontend-development.md` | `repomix-analysis/03-architecture-only.xml` | `.claude/patterns/react-query-usage.md` | +| **Security review** | `.claude/context/security-standards.md` | `repomix-analysis/03-architecture-only.xml` | `.claude/patterns/error-handling.md` | +| **Architecture questions** | - | `repomix-analysis/03-architecture-only.xml` | See ADRs below | + +**Note:** We use a single repomix architecture view (grade 8.8/10, 187K tokens) for all tasks. See `.claude/repomix-guide.md` for details. + +### Available Memory Files + +**1. Context Files** (`.claude/context/`) + +- `backend-development.md` - Go backend, K8s integration, handler patterns +- `frontend-development.md` - NextJS, Shadcn UI, React Query patterns +- `security-standards.md` - Auth, RBAC, token handling, security patterns + +**2. Architectural Decision Records** (`docs/adr/`) + +- Documents WHY decisions were made, not just WHAT +- `0001-kubernetes-native-architecture.md` +- `0002-user-token-authentication.md` +- `0003-multi-repo-support.md` +- `0004-go-backend-python-runner.md` +- `0005-nextjs-shadcn-react-query.md` + +**3. Code Pattern Catalog** (`.claude/patterns/`) + +- `error-handling.md` - Consistent error patterns (backend, operator, runner) +- `k8s-client-usage.md` - When to use user token vs. service account +- `react-query-usage.md` - Data fetching patterns (queries, mutations, caching) + +**4. Repomix Usage Guide** (`.claude/repomix-guide.md`) + +- Guide for using the architecture view effectively +- Why we use a single view approach (vs. 7 views) + +**5. 
Decision Log** (`docs/decisions.md`) + +- Lightweight chronological record of major decisions +- Links to ADRs, code, and context files + +### Example Usage + +``` +"Claude, load the architecture view (repomix-analysis/03-architecture-only.xml) and the +backend-development context file, then help me add a new endpoint for listing RFE workflows." +``` + +``` +"Claude, load the architecture view and security-standards context file, +then review this PR for token handling issues." +``` + +``` +"Claude, check ADR-0002 (User Token Authentication) and explain why we use user tokens +instead of service accounts for API operations." +``` + +## Development Commands + +### Quick Start - Local Development + +**Single command setup with OpenShift Local (CRC):** + +```bash +# Prerequisites: brew install crc +# Get free Red Hat pull secret from console.redhat.com/openshift/create/local +make dev-start + +# Access at https://vteam-frontend-vteam-dev.apps-crc.testing +``` + +**Hot-reloading development:** + +```bash +# Terminal 1 +DEV_MODE=true make dev-start + +# Terminal 2 (separate terminal) +make dev-sync +``` + +### Building Components + +```bash +# Build all container images (default: docker, linux/amd64) +make build-all + +# Build with podman +make build-all CONTAINER_ENGINE=podman + +# Build for ARM64 +make build-all PLATFORM=linux/arm64 + +# Build individual components +make build-frontend +make build-backend +make build-operator +make build-runner + +# Push to registry +make push-all REGISTRY=quay.io/your-username +``` + +### Deployment + +```bash +# Deploy with default images from quay.io/ambient_code +make deploy + +# Deploy to custom namespace +make deploy NAMESPACE=my-namespace + +# Deploy with custom images +cd components/manifests +cp env.example .env +# Edit .env with ANTHROPIC_API_KEY and CONTAINER_REGISTRY +./deploy.sh + +# Clean up deployment +make clean +``` + +### Component Development + +See component-specific documentation for detailed development commands: 
+ +- **Backend** (`components/backend/README.md`): Go API development, testing, linting +- **Frontend** (`components/frontend/README.md`): NextJS development, see also `DESIGN_GUIDELINES.md` +- **Operator** (`components/operator/README.md`): Operator development, watch patterns +- **Claude Code Runner** (`components/runners/claude-code-runner/README.md`): Python runner development + +**Common commands**: + +```bash +make build-all # Build all components +make deploy # Deploy to cluster +make test # Run tests +make lint # Lint code +``` + +### Documentation + +```bash +# Install documentation dependencies +pip install -r requirements-docs.txt + +# Serve locally at http://127.0.0.1:8000 +mkdocs serve + +# Build static site +mkdocs build + +# Deploy to GitHub Pages +mkdocs gh-deploy + +# Markdown linting +markdownlint docs/**/*.md +``` + +### Local Development Helpers + +```bash +# View logs +make dev-logs # Both backend and frontend +make dev-logs-backend # Backend only +make dev-logs-frontend # Frontend only +make dev-logs-operator # Operator only + +# Operator management +make dev-restart-operator # Restart operator deployment +make dev-operator-status # Show operator status and events + +# Cleanup +make dev-stop # Stop processes, keep CRC running +make dev-stop-cluster # Stop processes and shutdown CRC +make dev-clean # Stop and delete OpenShift project + +# Testing +make dev-test # Run smoke tests +make dev-test-operator # Test operator only +``` + +## Key Architecture Patterns + +### Custom Resource Definitions (CRDs) + +The platform defines three primary CRDs: + +1. **AgenticSession** (`agenticsessions.vteam.ambient-code`): Represents an AI execution session + - Spec: prompt, repos (multi-repo support), interactive mode, timeout, model selection + - Status: phase, startTime, completionTime, results, error messages, per-repo push status + +2. 
**ProjectSettings** (`projectsettings.vteam.ambient-code`): Project-scoped configuration + - Manages API keys, default models, timeout settings + - Namespace-isolated for multi-tenancy + +3. **RFEWorkflow** (`rfeworkflows.vteam.ambient-code`): RFE (Request For Enhancement) workflows + - 7-step agent council process for engineering refinement + - Agent roles: PM, Architect, Staff Engineer, PO, Team Lead, Team Member, Delivery Owner + +### Multi-Repo Support + +AgenticSessions support operating on multiple repositories simultaneously: + +- Each repo has required `input` (URL, branch) and optional `output` (fork/target) configuration +- `mainRepoIndex` specifies which repo is the Claude working directory (default: 0) +- Per-repo status tracking: `pushed` or `abandoned` + +### Interactive vs Batch Mode + +- **Batch Mode** (default): Single prompt execution with timeout +- **Interactive Mode** (`interactive: true`): Long-running chat sessions using inbox/outbox files + +### Backend API Structure + +The Go backend (`components/backend/`) implements: + +- **Project-scoped endpoints**: `/api/projects/:project/*` for namespaced resources +- **Multi-tenant isolation**: Each project maps to a Kubernetes namespace +- **WebSocket support**: Real-time session updates via `websocket_messaging.go` +- **Git operations**: Repository cloning, forking, PR creation via `git.go` +- **RBAC integration**: OpenShift OAuth for authentication + +Main handler logic in `handlers.go` (3906 lines) manages: + +- Project CRUD operations +- AgenticSession lifecycle +- ProjectSettings management +- RFE workflow orchestration + +### Operator Reconciliation Loop + +The Kubernetes operator (`components/operator/`) watches for: + +- AgenticSession creation/updates → spawns Jobs with runner pods +- Job completion → updates CR status with results +- Timeout handling and cleanup + +### Runner Execution + +The Claude Code runner (`components/runners/claude-code-runner/`) provides: + +- Claude Code SDK 
integration (`claude-code-sdk>=0.0.23`) +- Workspace synchronization via PVC proxy +- Multi-agent collaboration capabilities +- Anthropic API streaming (`anthropic>=0.68.0`) + +## Configuration Standards + +### Python + +- **Virtual environments**: Always use `python -m venv venv` or `uv venv` +- **Package manager**: Prefer `uv` over `pip` +- **Formatting**: black (double quotes) +- **Import sorting**: isort with black profile +- **Linting**: flake8 (ignore E203, W503) + +### Go + +- **Formatting**: `go fmt ./...` (enforced) +- **Linting**: golangci-lint (install via `make install-tools`) +- **Testing**: Table-driven tests with subtests +- **Error handling**: Explicit error returns, no panic in production code + +### Container Images + +- **Default registry**: `quay.io/ambient_code` +- **Image tags**: Component-specific (vteam_frontend, vteam_backend, vteam_operator, vteam_claude_runner) +- **Platform**: Default `linux/amd64`, ARM64 supported via `PLATFORM=linux/arm64` +- **Build tool**: Docker or Podman (`CONTAINER_ENGINE=podman`) + +### Git Workflow + +- **Default branch**: `main` +- **Feature branches**: Required for development +- **Commit style**: Conventional commits (squashed on merge) +- **Branch verification**: Always check current branch before file modifications + +### Kubernetes/OpenShift + +- **Default namespace**: `ambient-code` (production), `vteam-dev` (local dev) +- **CRD group**: `vteam.ambient-code` +- **API version**: `v1alpha1` (current) +- **RBAC**: Namespace-scoped service accounts with minimal permissions + +## Backend and Operator Development Standards + +**IMPORTANT**: When working on backend (`components/backend/`) or operator (`components/operator/`) code, you MUST follow these strict guidelines based on established patterns in the codebase. + +### Critical Rules (Never Violate) + +1. 
**User Token Authentication Required** + - FORBIDDEN: Using backend service account for user-initiated API operations + - REQUIRED: Always use `GetK8sClientsForRequest(c)` to get user-scoped K8s clients + - REQUIRED: Return `401 Unauthorized` if user token is missing or invalid + - Exception: Backend service account ONLY for CR writes and token minting (handlers/sessions.go:227, handlers/sessions.go:449) + +2. **Never Panic in Production Code** + - FORBIDDEN: `panic()` in handlers, reconcilers, or any production path + - REQUIRED: Return explicit errors with context: `return fmt.Errorf("failed to X: %w", err)` + - REQUIRED: Log errors before returning: `log.Printf("Operation failed: %v", err)` + +3. **Token Security and Redaction** + - FORBIDDEN: Logging tokens, API keys, or sensitive headers + - REQUIRED: Redact tokens in logs using custom formatters (server/server.go:22-34) + - REQUIRED: Use `log.Printf("tokenLen=%d", len(token))` instead of logging token content + - Example: `path = strings.Split(path, "?")[0] + "?token=[REDACTED]"` + +4. **Type-Safe Unstructured Access** + - FORBIDDEN: Direct type assertions without checking: `obj.Object["spec"].(map[string]interface{})` + - REQUIRED: Use `unstructured.Nested*` helpers with three-value returns + - Example: `spec, found, err := unstructured.NestedMap(obj.Object, "spec")` + - REQUIRED: Check `found` before using values; handle type mismatches gracefully + +5. 
**OwnerReferences for Resource Lifecycle** + - REQUIRED: Set OwnerReferences on all child resources (Jobs, Secrets, PVCs, Services) + - REQUIRED: Use `Controller: boolPtr(true)` for primary owner + - FORBIDDEN: `BlockOwnerDeletion` (causes permission issues in multi-tenant environments) + - Pattern: (operator/internal/handlers/sessions.go:125-134, handlers/sessions.go:470-476) + +### Package Organization + +**Backend Structure** (`components/backend/`): + +``` +backend/ +├── handlers/ # HTTP handlers grouped by resource +│ ├── sessions.go # AgenticSession CRUD + lifecycle +│ ├── projects.go # Project management +│ ├── rfe.go # RFE workflows +│ ├── helpers.go # Shared utilities (StringPtr, etc.) +│ └── middleware.go # Auth, validation, RBAC +├── types/ # Type definitions (no business logic) +│ ├── session.go +│ ├── project.go +│ └── common.go +├── server/ # Server setup, CORS, middleware +├── k8s/ # K8s resource templates +├── git/, github/ # External integrations +├── websocket/ # Real-time messaging +├── routes.go # HTTP route registration +└── main.go # Wiring, dependency injection +``` + +**Operator Structure** (`components/operator/`): + +``` +operator/ +├── internal/ +│ ├── config/ # K8s client init, config loading +│ ├── types/ # GVR definitions, resource helpers +│ ├── handlers/ # Watch handlers (sessions, namespaces, projectsettings) +│ └── services/ # Reusable services (PVC provisioning, etc.) 
+└── main.go # Watch coordination +``` + +**Rules**: + +- Handlers contain HTTP/watch logic ONLY +- Types are pure data structures +- Business logic in separate service packages +- No cyclic dependencies between packages + +### Kubernetes Client Patterns + +**User-Scoped Clients** (for API operations): + +```go +// ALWAYS use for user-initiated operations (list, get, create, update, delete) +reqK8s, reqDyn := GetK8sClientsForRequest(c) +if reqK8s == nil { + c.JSON(http.StatusUnauthorized, gin.H{"error": "Invalid or missing token"}) + c.Abort() + return +} +// Use reqDyn for CR operations in user's authorized namespaces +list, err := reqDyn.Resource(gvr).Namespace(project).List(ctx, v1.ListOptions{}) +``` + +**Backend Service Account Clients** (limited use cases): + +```go +// ONLY use for: +// 1. Writing CRs after validation (handlers/sessions.go:417) +// 2. Minting tokens/secrets for runners (handlers/sessions.go:449) +// 3. Cross-namespace operations backend is authorized for +// Available as: DynamicClient, K8sClient (package-level in handlers/) +created, err := DynamicClient.Resource(gvr).Namespace(project).Create(ctx, obj, v1.CreateOptions{}) +``` + +**Never**: + +- ❌ Fall back to service account when user token is invalid +- ❌ Use service account for list/get operations on behalf of users +- ❌ Skip RBAC checks by using elevated permissions + +### Error Handling Patterns + +**Handler Errors**: + +```go +// Pattern 1: Resource not found +if errors.IsNotFound(err) { + c.JSON(http.StatusNotFound, gin.H{"error": "Session not found"}) + return +} + +// Pattern 2: Log + return error +if err != nil { + log.Printf("Failed to create session %s in project %s: %v", name, project, err) + c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to create session"}) + return +} + +// Pattern 3: Non-fatal errors (continue operation) +if err := updateStatus(...); err != nil { + log.Printf("Warning: status update failed: %v", err) + // Continue - session was created 
successfully +} +``` + +**Operator Errors**: + +```go +// Pattern 1: Resource deleted during processing (non-fatal) +if errors.IsNotFound(err) { + log.Printf("AgenticSession %s no longer exists, skipping", name) + return nil // Don't treat as error +} + +// Pattern 2: Retriable errors in watch loop +if err != nil { + log.Printf("Failed to create job: %v", err) + updateAgenticSessionStatus(ns, name, map[string]interface{}{ + "phase": "Error", + "message": fmt.Sprintf("Failed to create job: %v", err), + }) + return fmt.Errorf("failed to create job: %v", err) +} +``` + +**Never**: + +- ❌ Silent failures (always log errors) +- ❌ Generic error messages ("operation failed") +- ❌ Retrying indefinitely without backoff + +### Resource Management + +**OwnerReferences Pattern**: + +```go +// Always set owner when creating child resources +ownerRef := v1.OwnerReference{ + APIVersion: obj.GetAPIVersion(), // e.g., "vteam.ambient-code/v1alpha1" + Kind: obj.GetKind(), // e.g., "AgenticSession" + Name: obj.GetName(), + UID: obj.GetUID(), + Controller: boolPtr(true), // Only one controller per resource + // BlockOwnerDeletion: intentionally omitted (permission issues) +} + +// Apply to child resources +job := &batchv1.Job{ + ObjectMeta: v1.ObjectMeta{ + Name: jobName, + Namespace: namespace, + OwnerReferences: []v1.OwnerReference{ownerRef}, + }, + // ... 
+} +``` + +**Cleanup Patterns**: + +```go +// Rely on OwnerReferences for automatic cleanup, but delete explicitly when needed +policy := v1.DeletePropagationBackground +err := K8sClient.BatchV1().Jobs(ns).Delete(ctx, jobName, v1.DeleteOptions{ + PropagationPolicy: &policy, +}) +if err != nil && !errors.IsNotFound(err) { + log.Printf("Failed to delete job: %v", err) + return err +} +``` + +### Security Patterns + +**Token Handling**: + +```go +// Extract token from Authorization header +rawAuth := c.GetHeader("Authorization") +parts := strings.SplitN(rawAuth, " ", 2) +if len(parts) != 2 || !strings.EqualFold(parts[0], "Bearer") { + c.JSON(http.StatusUnauthorized, gin.H{"error": "invalid Authorization header"}) + return +} +token := strings.TrimSpace(parts[1]) + +// NEVER log the token itself +log.Printf("Processing request with token (len=%d)", len(token)) +``` + +**RBAC Enforcement**: + +```go +// Always check permissions before operations +ssar := &authv1.SelfSubjectAccessReview{ + Spec: authv1.SelfSubjectAccessReviewSpec{ + ResourceAttributes: &authv1.ResourceAttributes{ + Group: "vteam.ambient-code", + Resource: "agenticsessions", + Verb: "list", + Namespace: project, + }, + }, +} +res, err := reqK8s.AuthorizationV1().SelfSubjectAccessReviews().Create(ctx, ssar, v1.CreateOptions{}) +if err != nil || !res.Status.Allowed { + c.JSON(http.StatusForbidden, gin.H{"error": "Unauthorized"}) + return +} +``` + +**Container Security**: + +```go +// Always set SecurityContext for Job pods +SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: boolPtr(false), + ReadOnlyRootFilesystem: boolPtr(false), // Only if temp files needed + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{"ALL"}, // Drop all by default + }, +}, +``` + +### API Design Patterns + +**Project-Scoped Endpoints**: + +```go +// Standard pattern: /api/projects/:projectName/resource +r.GET("/api/projects/:projectName/agentic-sessions", ValidateProjectContext(), ListSessions) 
+r.POST("/api/projects/:projectName/agentic-sessions", ValidateProjectContext(), CreateSession) +r.GET("/api/projects/:projectName/agentic-sessions/:sessionName", ValidateProjectContext(), GetSession) + +// ValidateProjectContext middleware: +// 1. Extracts project from route param +// 2. Validates user has access via RBAC check +// 3. Sets project in context: c.Set("project", projectName) +``` + +**Middleware Chain**: + +```go +// Order matters: Recovery → Logging → CORS → Identity → Validation → Handler +r.Use(gin.Recovery()) +r.Use(gin.LoggerWithFormatter(customRedactingFormatter)) +r.Use(cors.New(corsConfig)) +r.Use(forwardedIdentityMiddleware()) // Extracts X-Forwarded-User, etc. +r.Use(ValidateProjectContext()) // RBAC check +``` + +**Response Patterns**: + +```go +// Success with data +c.JSON(http.StatusOK, gin.H{"items": sessions}) + +// Success with created resource +c.JSON(http.StatusCreated, gin.H{"message": "Session created", "name": name, "uid": uid}) + +// Success with no content +c.Status(http.StatusNoContent) + +// Errors with structured messages +c.JSON(http.StatusBadRequest, gin.H{"error": "Invalid request"}) +``` + +### Operator Patterns + +**Watch Loop with Reconnection**: + +```go +func WatchAgenticSessions() { + gvr := types.GetAgenticSessionResource() + + for { // Infinite loop with reconnection + watcher, err := config.DynamicClient.Resource(gvr).Watch(ctx, v1.ListOptions{}) + if err != nil { + log.Printf("Failed to create watcher: %v", err) + time.Sleep(5 * time.Second) // Backoff before retry + continue + } + + log.Println("Watching for events...") + + for event := range watcher.ResultChan() { + switch event.Type { + case watch.Added, watch.Modified: + obj := event.Object.(*unstructured.Unstructured) + handleEvent(obj) + case watch.Deleted: + // Handle cleanup + } + } + + log.Println("Watch channel closed, restarting...") + watcher.Stop() + time.Sleep(2 * time.Second) + } +} +``` + +**Reconciliation Pattern**: + +```go +func 
handleEvent(obj *unstructured.Unstructured) error { + name := obj.GetName() + namespace := obj.GetNamespace() + + // 1. Verify resource still exists (avoid race conditions) + currentObj, err := getDynamicClient().Get(ctx, name, namespace) + if errors.IsNotFound(err) { + log.Printf("Resource %s no longer exists, skipping", name) + return nil // Not an error + } + + // 2. Get current phase/status + status, found, _ := unstructured.NestedMap(currentObj.Object, "status") + phase := getPhaseOrDefault(status, "Pending") + + // 3. Only reconcile if in expected state + if phase != "Pending" { + return nil // Already processed + } + + // 4. Create resources idempotently (check existence first) + if _, err := getResource(name); err == nil { + log.Printf("Resource %s already exists", name) + return nil + } + + // 5. Create and update status + createResource(...) + updateStatus(namespace, name, map[string]interface{}{"phase": "Creating"}) + + return nil +} +``` + +**Status Updates** (use UpdateStatus subresource): + +```go +func updateAgenticSessionStatus(namespace, name string, updates map[string]interface{}) error { + gvr := types.GetAgenticSessionResource() + + obj, err := config.DynamicClient.Resource(gvr).Namespace(namespace).Get(ctx, name, v1.GetOptions{}) + if errors.IsNotFound(err) { + log.Printf("Resource deleted, skipping status update") + return nil // Not an error + } + + if obj.Object["status"] == nil { + obj.Object["status"] = make(map[string]interface{}) + } + + status := obj.Object["status"].(map[string]interface{}) + for k, v := range updates { + status[k] = v + } + + // Use UpdateStatus subresource (requires /status permission) + _, err = config.DynamicClient.Resource(gvr).Namespace(namespace).UpdateStatus(ctx, obj, v1.UpdateOptions{}) + if errors.IsNotFound(err) { + return nil // Resource deleted during update + } + return err +} +``` + +**Goroutine Monitoring**: + +```go +// Start background monitoring (operator/internal/handlers/sessions.go:477) +go 
monitorJob(jobName, sessionName, namespace) + +// Monitoring loop checks both K8s Job status AND custom container status +func monitorJob(jobName, sessionName, namespace string) { + for { + time.Sleep(5 * time.Second) + + // 1. Check if parent resource still exists (exit if deleted) + if _, err := getSession(namespace, sessionName); errors.IsNotFound(err) { + log.Printf("Session deleted, stopping monitoring") + return + } + + // 2. Check Job status + job, err := K8sClient.BatchV1().Jobs(namespace).Get(ctx, jobName, v1.GetOptions{}) + if errors.IsNotFound(err) { + return + } + + // 3. Update status based on Job conditions + if job.Status.Succeeded > 0 { + updateStatus(namespace, sessionName, map[string]interface{}{ + "phase": "Completed", + "completionTime": time.Now().Format(time.RFC3339), + }) + cleanup(namespace, jobName) + return + } + } +} +``` + +### Pre-Commit Checklist for Backend/Operator + +Before committing backend or operator code, verify: + +- [ ] **Authentication**: All user-facing endpoints use `GetK8sClientsForRequest(c)` +- [ ] **Authorization**: RBAC checks performed before resource access +- [ ] **Error Handling**: All errors logged with context, appropriate HTTP status codes +- [ ] **Token Security**: No tokens or sensitive data in logs +- [ ] **Type Safety**: Used `unstructured.Nested*` helpers, checked `found` before using values +- [ ] **Resource Cleanup**: OwnerReferences set on all child resources +- [ ] **Status Updates**: Used `UpdateStatus` subresource, handled IsNotFound gracefully +- [ ] **Tests**: Added/updated tests for new functionality +- [ ] **Logging**: Structured logs with relevant context (namespace, resource name, etc.) +- [ ] **Code Quality**: Ran all linting checks locally (see below) + +**Run these commands before committing:** + +```bash +# Backend +cd components/backend +gofmt -l . # Check formatting (should output nothing) +go vet ./... 
# Detect suspicious constructs +golangci-lint run # Run comprehensive linting + +# Operator +cd components/operator +gofmt -l . +go vet ./... +golangci-lint run +``` + +**Auto-format code:** + +```bash +gofmt -w components/backend components/operator +``` + +**Note**: GitHub Actions will automatically run these checks on your PR. Fix any issues locally before pushing. + +### Common Mistakes to Avoid + +**Backend**: + +- ❌ Using service account client for user operations (always use user token) +- ❌ Not checking if user-scoped client creation succeeded +- ❌ Logging full token values (use `len(token)` instead) +- ❌ Not validating project access in middleware +- ❌ Type assertions without checking: `val := obj["key"].(string)` (use `val, ok := ...`) +- ❌ Not setting OwnerReferences (causes resource leaks) +- ❌ Treating IsNotFound as fatal error during cleanup +- ❌ Exposing internal error details to API responses (use generic messages) + +**Operator**: + +- ❌ Not reconnecting watch on channel close +- ❌ Processing events without verifying resource still exists +- ❌ Updating status on main object instead of /status subresource +- ❌ Not checking current phase before reconciliation (causes duplicate resources) +- ❌ Creating resources without idempotency checks +- ❌ Goroutine leaks (not exiting monitor when resource deleted) +- ❌ Using `panic()` in watch/reconciliation loops +- ❌ Not setting SecurityContext on Job pods + +### Reference Files + +Study these files to understand established patterns: + +**Backend**: + +- `components/backend/handlers/sessions.go` - Complete session lifecycle, user/SA client usage +- `components/backend/handlers/middleware.go` - Auth patterns, token extraction, RBAC +- `components/backend/handlers/helpers.go` - Utility functions (StringPtr, BoolPtr) +- `components/backend/types/common.go` - Type definitions +- `components/backend/server/server.go` - Server setup, middleware chain, token redaction +- `components/backend/routes.go` - HTTP route
definitions and registration + +**Operator**: + +- `components/operator/internal/handlers/sessions.go` - Watch loop, reconciliation, status updates +- `components/operator/internal/config/config.go` - K8s client initialization +- `components/operator/internal/types/resources.go` - GVR definitions +- `components/operator/internal/services/infrastructure.go` - Reusable services + +## GitHub Actions CI/CD + +### Component Build Pipeline (`.github/workflows/components-build-deploy.yml`) + +- **Change detection**: Only builds modified components (frontend, backend, operator, claude-runner) +- **Multi-platform builds**: linux/amd64 and linux/arm64 +- **Registry**: Pushes to `quay.io/ambient_code` on main branch +- **PR builds**: Build-only, no push on pull requests + +### Automation Workflows + +- **amber-issue-handler.yml**: Amber background agent - automated fixes via GitHub issue labels (`amber:auto-fix`, `amber:refactor`, `amber:test-coverage`) or `/amber execute` command +- **amber-dependency-sync.yml**: Daily sync of dependency versions to Amber agent knowledge base +- **claude.yml**: Claude Code integration - responds to `@claude` mentions in issues/PRs +- **claude-code-review.yml**: Automated code reviews on pull requests + +### Code Quality Workflows + +- **go-lint.yml**: Go code formatting, vetting, and linting (gofmt, go vet, golangci-lint) +- **frontend-lint.yml**: Frontend code quality (ESLint, TypeScript checking, build validation) + +### Deployment & Testing Workflows + +- **prod-release-deploy.yaml**: Production releases with semver versioning and changelog generation +- **e2e.yml**: End-to-end Cypress testing in kind cluster (see Testing Strategy section) +- **test-local-dev.yml**: Local development environment validation + +### Utility Workflows + +- **docs.yml**: Deploy MkDocs documentation to GitHub Pages +- **dependabot-auto-merge.yml**: Auto-approve and merge Dependabot dependency updates + +## Testing Strategy + +### E2E Tests (Cypress + Kind) + 
+**Purpose**: Automated end-to-end testing of the complete vTeam stack in a Kubernetes environment. + +**Location**: `e2e/` + +**Quick Start**: + +```bash +make e2e-test CONTAINER_ENGINE=podman # Or docker +``` + +**What Gets Tested**: + +- ✅ Full vTeam deployment in kind (Kubernetes in Docker) +- ✅ Frontend UI rendering and navigation +- ✅ Backend API connectivity +- ✅ Project creation workflow (main user journey) +- ✅ Authentication with ServiceAccount tokens +- ✅ Ingress routing +- ✅ All pods deploy and become ready + +**What Doesn't Get Tested**: + +- ❌ OAuth proxy flow (uses direct token auth for simplicity) +- ❌ Session pod execution (requires Anthropic API key) +- ❌ Multi-user scenarios + +**Test Suite** (`e2e/cypress/e2e/vteam.cy.ts`): + +1. UI loads with token authentication +2. Navigate to new project page +3. Create a new project +4. List created projects +5. Backend API cluster-info endpoint + +**CI Integration**: Tests run automatically on all PRs via GitHub Actions (`.github/workflows/e2e.yml`) + +**Key Implementation Details**: + +- **Architecture**: Frontend without oauth-proxy, direct token injection via environment variables +- **Authentication**: Test user ServiceAccount with cluster-admin permissions +- **Token Handling**: Frontend deployment includes `OC_TOKEN`, `OC_USER`, `OC_EMAIL` env vars +- **Podman Support**: Auto-detects runtime, uses ports 8080/8443 for rootless Podman +- **Ingress**: Standard nginx-ingress with path-based routing + +**Adding New Tests**: + +```typescript +it('should test new feature', () => { + cy.visit('/some-page') + cy.contains('Expected Content').should('be.visible') + cy.get('#button').click() + // Auth header automatically injected via beforeEach interceptor +}) +``` + +**Debugging Tests**: + +```bash +cd e2e +source .env.test +CYPRESS_TEST_TOKEN="$TEST_TOKEN" CYPRESS_BASE_URL="http://vteam.local:8080" npm run test:headed +``` + +**Documentation**: See `e2e/README.md` and `docs/testing/e2e-guide.md`
for comprehensive testing guide. + +### Backend Tests (Go) + +- **Unit tests** (`tests/unit/`): Isolated component logic +- **Contract tests** (`tests/contract/`): API contract validation +- **Integration tests** (`tests/integration/`): End-to-end with real k8s cluster + - Requires `TEST_NAMESPACE` environment variable + - Set `CLEANUP_RESOURCES=true` for automatic cleanup + - Permission tests validate RBAC boundaries + +### Frontend Tests (NextJS) + +- Jest for component testing (when configured) +- Cypress for e2e testing (see E2E Tests section above) + +### Operator Tests (Go) + +- Controller reconciliation logic tests +- CRD validation tests + +## Documentation Structure + +The MkDocs site (`mkdocs.yml`) provides: + +- **User Guide**: Getting started, RFE creation, agent framework, configuration +- **Developer Guide**: Setup, architecture, plugin development, API reference, testing +- **Labs**: Hands-on exercises (basic → advanced → production) + - Basic: First RFE, agent interaction, workflow basics + - Advanced: Custom agents, workflow modification, integration testing + - Production: Jira integration, OpenShift deployment, scaling +- **Reference**: Agent personas, API endpoints, configuration schema, glossary + +### Documentation Standards + +**Default to improving existing documentation** rather than creating new files.
When adding or updating documentation (standalone files like `.md`, design docs, guides): + +- **Prefer inline updates**: Improve existing markdown files or code comments +- **Colocate new docs**: When feasible, documentation should live in the subdirectory that has the relevant code (e.g., `components/backend/README.md`) not at the top level +- **Avoid top-level proliferation**: Only create top-level docs for cross-cutting concerns (architecture, security, deployment) +- **Follow established patterns**: See `docs/amber-quickstart.md` and `components/backend/README.md` for examples of well-organized documentation + +### Director Training Labs + +Special lab track for leadership training located in `docs/labs/director-training/`: + +- Structured exercises for understanding the vTeam system from a strategic perspective +- Validation reports for tracking completion and understanding + +## Production Considerations + +### Security + +- **API keys**: Store in Kubernetes Secrets, managed via ProjectSettings CR +- **RBAC**: Namespace-scoped isolation prevents cross-project access +- **OAuth integration**: OpenShift OAuth for cluster-based authentication (see `docs/OPENSHIFT_OAUTH.md`) +- **Network policies**: Component isolation and secure communication + +### Monitoring + +- **Health endpoints**: `/health` on backend API +- **Logs**: Structured logging with OpenShift integration +- **Metrics**: Prometheus-compatible (when configured) +- **Events**: Kubernetes events for operator actions + +### Scaling + +- **Horizontal Pod Autoscaling**: Configure based on CPU/memory +- **Job concurrency**: Operator manages concurrent session execution +- **Resource limits**: Set appropriate requests/limits per component +- **Multi-tenancy**: Project-based isolation with shared infrastructure + +--- + +## Frontend Development Standards + +**See `components/frontend/DESIGN_GUIDELINES.md` for complete frontend development patterns.** + +### Critical Rules (Quick Reference) + +1. 
**Zero `any` Types** - Use proper types, `unknown`, or generic constraints +2. **Shadcn UI Components Only** - Use `@/components/ui/*` components, no custom UI from scratch +3. **React Query for ALL Data Operations** - Use hooks from `@/services/queries/*`, no manual `fetch()` +4. **Use `type` over `interface`** - Always prefer `type` for type definitions +5. **Colocate Single-Use Components** - Keep page-specific components with their pages + +### Pre-Commit Checklist for Frontend + +Before committing frontend code: + +- [ ] Zero `any` types (or justified with eslint-disable) +- [ ] All UI uses Shadcn components +- [ ] All data operations use React Query +- [ ] Components under 200 lines +- [ ] Single-use components colocated with their pages +- [ ] All buttons have loading states +- [ ] All lists have empty states +- [ ] All nested pages have breadcrumbs +- [ ] All routes have loading.tsx, error.tsx +- [ ] `npm run build` passes with 0 errors, 0 warnings +- [ ] All types use `type` instead of `interface` + +### Reference Files + +- `components/frontend/DESIGN_GUIDELINES.md` - Detailed patterns and examples +- `components/frontend/COMPONENT_PATTERNS.md` - Architecture patterns +- `components/frontend/src/components/ui/` - Available Shadcn components +- `components/frontend/src/services/` - API service layer examples diff --git a/AMBER_SETUP.md b/AMBER_SETUP.md index 1d3b333c3..3550c5d27 100644 --- a/AMBER_SETUP.md +++ b/AMBER_SETUP.md @@ -83,7 +83,7 @@ always_create_branch: true # Safe workflow - Security considerations - Troubleshooting -4. **CLAUDE.md** - Updated with Amber section +4. **AGENTS.md** - Updated with Amber section - Quick links to documentation - Common workflows summary diff --git a/CLAUDE.md b/CLAUDE.md deleted file mode 100644 index 5f8d5fb47..000000000 --- a/CLAUDE.md +++ /dev/null @@ -1,1062 +0,0 @@ -# CLAUDE.md - -This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. 
- -## Project Overview - -The **Ambient Code Platform** is a Kubernetes-native AI automation platform that orchestrates intelligent agentic sessions through containerized microservices. The platform enables AI-powered automation for analysis, research, development, and content creation tasks via a modern web interface. - -> **Note:** This project was formerly known as "vTeam". Technical artifacts (image names, namespaces, API groups) still use "vteam" for backward compatibility. - -### Amber Background Agent - -The platform includes **Amber**, a background agent that automates common development tasks via GitHub Issues. Team members can trigger automated fixes, refactoring, and test additions without requiring direct access to Claude Code. - -**Quick Links**: - -- [Amber Quickstart](docs/amber-quickstart.md) - Get started in 5 minutes -- [Full Documentation](docs/amber-automation.md) - Complete automation guide -- [Amber Config](.claude/amber-config.yml) - Automation policies - -**Common Workflows**: - -- 🤖 **Auto-Fix** (label: `amber:auto-fix`): Formatting, linting, trivial fixes -- 🔧 **Refactoring** (label: `amber:refactor`): Break large files, extract patterns -- 🧪 **Test Coverage** (label: `amber:test-coverage`): Add missing tests - -### Core Architecture - -The system follows a Kubernetes-native pattern with Custom Resources, Operators, and Job execution: - -1. **Frontend** (NextJS + Shadcn): Web UI for session management and monitoring -2. **Backend API** (Go + Gin): REST API managing Kubernetes Custom Resources with multi-tenant project isolation -3. **Agentic Operator** (Go): Kubernetes controller watching CRs and creating Jobs -4.
**Claude Code Runner** (Python): Job pods executing Claude Code CLI with multi-agent collaboration - -### Agentic Session Flow - -``` -User Creates Session → Backend Creates CR → Operator Spawns Job → -Pod Runs Claude CLI → Results Stored in CR → UI Displays Progress -``` - -## Memory System - Loadable Context - -This repository uses a structured **memory system** to provide targeted, loadable context instead of relying solely on this comprehensive CLAUDE.md file. - -### Quick Reference - -**Load these files when working in specific areas:** - -| Task Type | Context File | Architecture View | Pattern File | -|-----------|--------------|-------------------|--------------| -| **Backend API work** | `.claude/context/backend-development.md` | `repomix-analysis/03-architecture-only.xml` | `.claude/patterns/k8s-client-usage.md` | -| **Frontend UI work** | `.claude/context/frontend-development.md` | `repomix-analysis/03-architecture-only.xml` | `.claude/patterns/react-query-usage.md` | -| **Security review** | `.claude/context/security-standards.md` | `repomix-analysis/03-architecture-only.xml` | `.claude/patterns/error-handling.md` | -| **Architecture questions** | - | `repomix-analysis/03-architecture-only.xml` | See ADRs below | - -**Note:** We use a single repomix architecture view (grade 8.8/10, 187K tokens) for all tasks. See `.claude/repomix-guide.md` for details. - -### Available Memory Files - -**1. Context Files** (`.claude/context/`) - -- `backend-development.md` - Go backend, K8s integration, handler patterns -- `frontend-development.md` - NextJS, Shadcn UI, React Query patterns -- `security-standards.md` - Auth, RBAC, token handling, security patterns - -**2.
Architectural Decision Records** (`docs/adr/`) - -- Documents WHY decisions were made, not just WHAT -- `0001-kubernetes-native-architecture.md` -- `0002-user-token-authentication.md` -- `0003-multi-repo-support.md` -- `0004-go-backend-python-runner.md` -- `0005-nextjs-shadcn-react-query.md` - -**3. Code Pattern Catalog** (`.claude/patterns/`) - -- `error-handling.md` - Consistent error patterns (backend, operator, runner) -- `k8s-client-usage.md` - When to use user token vs. service account -- `react-query-usage.md` - Data fetching patterns (queries, mutations, caching) - -**4. Repomix Usage Guide** (`.claude/repomix-guide.md`) - -- Guide for using the architecture view effectively -- Why we use a single view approach (vs. 7 views) - -**5. Decision Log** (`docs/decisions.md`) - -- Lightweight chronological record of major decisions -- Links to ADRs, code, and context files - -### Example Usage - -``` -"Claude, load the architecture view (repomix-analysis/03-architecture-only.xml) and the -backend-development context file, then help me add a new endpoint for listing RFE workflows." -``` - -``` -"Claude, load the architecture view and security-standards context file, -then review this PR for token handling issues." -``` - -``` -"Claude, check ADR-0002 (User Token Authentication) and explain why we use user tokens -instead of service accounts for API operations." 
-``` - -## Development Commands - -### Quick Start - Local Development - -**Single command setup with OpenShift Local (CRC):** - -```bash -# Prerequisites: brew install crc -# Get free Red Hat pull secret from console.redhat.com/openshift/create/local -make dev-start - -# Access at https://vteam-frontend-vteam-dev.apps-crc.testing -``` - -**Hot-reloading development:** - -```bash -# Terminal 1 -DEV_MODE=true make dev-start - -# Terminal 2 (separate terminal) -make dev-sync -``` - -### Building Components - -```bash -# Build all container images (default: docker, linux/amd64) -make build-all - -# Build with podman -make build-all CONTAINER_ENGINE=podman - -# Build for ARM64 -make build-all PLATFORM=linux/arm64 - -# Build individual components -make build-frontend -make build-backend -make build-operator -make build-runner - -# Push to registry -make push-all REGISTRY=quay.io/your-username -``` - -### Deployment - -```bash -# Deploy with default images from quay.io/ambient_code -make deploy - -# Deploy to custom namespace -make deploy NAMESPACE=my-namespace - -# Deploy with custom images -cd components/manifests -cp env.example .env -# Edit .env with ANTHROPIC_API_KEY and CONTAINER_REGISTRY -./deploy.sh - -# Clean up deployment -make clean -``` - -### Component Development - -See component-specific documentation for detailed development commands: - -- **Backend** (`components/backend/README.md`): Go API development, testing, linting -- **Frontend** (`components/frontend/README.md`): NextJS development, see also `DESIGN_GUIDELINES.md` -- **Operator** (`components/operator/README.md`): Operator development, watch patterns -- **Claude Code Runner** (`components/runners/claude-code-runner/README.md`): Python runner development - -**Common commands**: - -```bash -make build-all # Build all components -make deploy # Deploy to cluster -make test # Run tests -make lint # Lint code -``` - -### Documentation - -```bash -# Install documentation dependencies -pip install -r 
requirements-docs.txt - -# Serve locally at http://127.0.0.1:8000 -mkdocs serve - -# Build static site -mkdocs build - -# Deploy to GitHub Pages -mkdocs gh-deploy - -# Markdown linting -markdownlint docs/**/*.md -``` - -### Local Development Helpers - -```bash -# View logs -make dev-logs # Both backend and frontend -make dev-logs-backend # Backend only -make dev-logs-frontend # Frontend only -make dev-logs-operator # Operator only - -# Operator management -make dev-restart-operator # Restart operator deployment -make dev-operator-status # Show operator status and events - -# Cleanup -make dev-stop # Stop processes, keep CRC running -make dev-stop-cluster # Stop processes and shutdown CRC -make dev-clean # Stop and delete OpenShift project - -# Testing -make dev-test # Run smoke tests -make dev-test-operator # Test operator only -``` - -## Key Architecture Patterns - -### Custom Resource Definitions (CRDs) - -The platform defines three primary CRDs: - -1. **AgenticSession** (`agenticsessions.vteam.ambient-code`): Represents an AI execution session - - Spec: prompt, repos (multi-repo support), interactive mode, timeout, model selection - - Status: phase, startTime, completionTime, results, error messages, per-repo push status - -2. **ProjectSettings** (`projectsettings.vteam.ambient-code`): Project-scoped configuration - - Manages API keys, default models, timeout settings - - Namespace-isolated for multi-tenancy - -3. 
**RFEWorkflow** (`rfeworkflows.vteam.ambient-code`): RFE (Request For Enhancement) workflows - - 7-step agent council process for engineering refinement - - Agent roles: PM, Architect, Staff Engineer, PO, Team Lead, Team Member, Delivery Owner - -### Multi-Repo Support - -AgenticSessions support operating on multiple repositories simultaneously: - -- Each repo has required `input` (URL, branch) and optional `output` (fork/target) configuration -- `mainRepoIndex` specifies which repo is the Claude working directory (default: 0) -- Per-repo status tracking: `pushed` or `abandoned` - -### Interactive vs Batch Mode - -- **Batch Mode** (default): Single prompt execution with timeout -- **Interactive Mode** (`interactive: true`): Long-running chat sessions using inbox/outbox files - -### Backend API Structure - -The Go backend (`components/backend/`) implements: - -- **Project-scoped endpoints**: `/api/projects/:project/*` for namespaced resources -- **Multi-tenant isolation**: Each project maps to a Kubernetes namespace -- **WebSocket support**: Real-time session updates via `websocket_messaging.go` -- **Git operations**: Repository cloning, forking, PR creation via `git.go` -- **RBAC integration**: OpenShift OAuth for authentication - -Main handler logic in `handlers.go` (3906 lines) manages: - -- Project CRUD operations -- AgenticSession lifecycle -- ProjectSettings management -- RFE workflow orchestration - -### Operator Reconciliation Loop - -The Kubernetes operator (`components/operator/`) watches for: - -- AgenticSession creation/updates → spawns Jobs with runner pods -- Job completion → updates CR status with results -- Timeout handling and cleanup - -### Runner Execution - -The Claude Code runner (`components/runners/claude-code-runner/`) provides: - -- Claude Code SDK integration (`claude-code-sdk>=0.0.23`) -- Workspace synchronization via PVC proxy -- Multi-agent collaboration capabilities -- Anthropic API streaming (`anthropic>=0.68.0`) - -##
Configuration Standards - -### Python - -- **Virtual environments**: Always use `python -m venv venv` or `uv venv` -- **Package manager**: Prefer `uv` over `pip` -- **Formatting**: black (double quotes) -- **Import sorting**: isort with black profile -- **Linting**: flake8 (ignore E203, W503) - -### Go - -- **Formatting**: `go fmt ./...` (enforced) -- **Linting**: golangci-lint (install via `make install-tools`) -- **Testing**: Table-driven tests with subtests -- **Error handling**: Explicit error returns, no panic in production code - -### Container Images - -- **Default registry**: `quay.io/ambient_code` -- **Image tags**: Component-specific (vteam_frontend, vteam_backend, vteam_operator, vteam_claude_runner) -- **Platform**: Default `linux/amd64`, ARM64 supported via `PLATFORM=linux/arm64` -- **Build tool**: Docker or Podman (`CONTAINER_ENGINE=podman`) - -### Git Workflow - -- **Default branch**: `main` -- **Feature branches**: Required for development -- **Commit style**: Conventional commits (squashed on merge) -- **Branch verification**: Always check current branch before file modifications - -### Kubernetes/OpenShift - -- **Default namespace**: `ambient-code` (production), `vteam-dev` (local dev) -- **CRD group**: `vteam.ambient-code` -- **API version**: `v1alpha1` (current) -- **RBAC**: Namespace-scoped service accounts with minimal permissions - -## Backend and Operator Development Standards - -**IMPORTANT**: When working on backend (`components/backend/`) or operator (`components/operator/`) code, you MUST follow these strict guidelines based on established patterns in the codebase. - -### Critical Rules (Never Violate) - -1. 
**User Token Authentication Required** - - FORBIDDEN: Using backend service account for user-initiated API operations - - REQUIRED: Always use `GetK8sClientsForRequest(c)` to get user-scoped K8s clients - - REQUIRED: Return `401 Unauthorized` if user token is missing or invalid - - Exception: Backend service account ONLY for CR writes and token minting (handlers/sessions.go:227, handlers/sessions.go:449) - -2. **Never Panic in Production Code** - - FORBIDDEN: `panic()` in handlers, reconcilers, or any production path - - REQUIRED: Return explicit errors with context: `return fmt.Errorf("failed to X: %w", err)` - - REQUIRED: Log errors before returning: `log.Printf("Operation failed: %v", err)` - -3. **Token Security and Redaction** - - FORBIDDEN: Logging tokens, API keys, or sensitive headers - - REQUIRED: Redact tokens in logs using custom formatters (server/server.go:22-34) - - REQUIRED: Use `log.Printf("tokenLen=%d", len(token))` instead of logging token content - - Example: `path = strings.Split(path, "?")[0] + "?token=[REDACTED]"` - -4. **Type-Safe Unstructured Access** - - FORBIDDEN: Direct type assertions without checking: `obj.Object["spec"].(map[string]interface{})` - - REQUIRED: Use `unstructured.Nested*` helpers with three-value returns - - Example: `spec, found, err := unstructured.NestedMap(obj.Object, "spec")` - - REQUIRED: Check `found` before using values; handle type mismatches gracefully - -5. 
**OwnerReferences for Resource Lifecycle** - - REQUIRED: Set OwnerReferences on all child resources (Jobs, Secrets, PVCs, Services) - - REQUIRED: Use `Controller: boolPtr(true)` for primary owner - - FORBIDDEN: `BlockOwnerDeletion` (causes permission issues in multi-tenant environments) - - Pattern: (operator/internal/handlers/sessions.go:125-134, handlers/sessions.go:470-476) - -### Package Organization - -**Backend Structure** (`components/backend/`): - -``` -backend/ -├── handlers/ # HTTP handlers grouped by resource -│ ├── sessions.go # AgenticSession CRUD + lifecycle -│ ├── projects.go # Project management -│ ├── rfe.go # RFE workflows -│ ├── helpers.go # Shared utilities (StringPtr, etc.) -│ └── middleware.go # Auth, validation, RBAC -├── types/ # Type definitions (no business logic) -│ ├── session.go -│ ├── project.go -│ └── common.go -├── server/ # Server setup, CORS, middleware -├── k8s/ # K8s resource templates -├── git/, github/ # External integrations -├── websocket/ # Real-time messaging -├── routes.go # HTTP route registration -└── main.go # Wiring, dependency injection -``` - -**Operator Structure** (`components/operator/`): - -``` -operator/ -├── internal/ -│ ├── config/ # K8s client init, config loading -│ ├── types/ # GVR definitions, resource helpers -│ ├── handlers/ # Watch handlers (sessions, namespaces, projectsettings) -│ └── services/ # Reusable services (PVC provisioning, etc.)
-└── main.go # Watch coordination -``` - -**Rules**: - -- Handlers contain HTTP/watch logic ONLY -- Types are pure data structures -- Business logic in separate service packages -- No cyclic dependencies between packages - -### Kubernetes Client Patterns - -**User-Scoped Clients** (for API operations): - -```go -// ALWAYS use for user-initiated operations (list, get, create, update, delete) -reqK8s, reqDyn := GetK8sClientsForRequest(c) -if reqK8s == nil { - c.JSON(http.StatusUnauthorized, gin.H{"error": "Invalid or missing token"}) - c.Abort() - return -} -// Use reqDyn for CR operations in user's authorized namespaces -list, err := reqDyn.Resource(gvr).Namespace(project).List(ctx, v1.ListOptions{}) -``` - -**Backend Service Account Clients** (limited use cases): - -```go -// ONLY use for: -// 1. Writing CRs after validation (handlers/sessions.go:417) -// 2. Minting tokens/secrets for runners (handlers/sessions.go:449) -// 3. Cross-namespace operations backend is authorized for -// Available as: DynamicClient, K8sClient (package-level in handlers/) -created, err := DynamicClient.Resource(gvr).Namespace(project).Create(ctx, obj, v1.CreateOptions{}) -``` - -**Never**: - -- ❌ Fall back to service account when user token is invalid -- ❌ Use service account for list/get operations on behalf of users -- ❌ Skip RBAC checks by using elevated permissions - -### Error Handling Patterns - -**Handler Errors**: - -```go -// Pattern 1: Resource not found -if errors.IsNotFound(err) { - c.JSON(http.StatusNotFound, gin.H{"error": "Session not found"}) - return -} - -// Pattern 2: Log + return error -if err != nil { - log.Printf("Failed to create session %s in project %s: %v", name, project, err) - c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to create session"}) - return -} - -// Pattern 3: Non-fatal errors (continue operation) -if err := updateStatus(...); err != nil { - log.Printf("Warning: status update failed: %v", err) - // Continue - session was created
successfully -} -``` - -**Operator Errors**: - -```go -// Pattern 1: Resource deleted during processing (non-fatal) -if errors.IsNotFound(err) { - log.Printf("AgenticSession %s no longer exists, skipping", name) - return nil // Don't treat as error -} - -// Pattern 2: Retriable errors in watch loop -if err != nil { - log.Printf("Failed to create job: %v", err) - updateAgenticSessionStatus(ns, name, map[string]interface{}{ - "phase": "Error", - "message": fmt.Sprintf("Failed to create job: %v", err), - }) - return fmt.Errorf("failed to create job: %v", err) -} -``` - -**Never**: - -- ❌ Silent failures (always log errors) -- ❌ Generic error messages ("operation failed") -- ❌ Retrying indefinitely without backoff - -### Resource Management - -**OwnerReferences Pattern**: - -```go -// Always set owner when creating child resources -ownerRef := v1.OwnerReference{ - APIVersion: obj.GetAPIVersion(), // e.g., "vteam.ambient-code/v1alpha1" - Kind: obj.GetKind(), // e.g., "AgenticSession" - Name: obj.GetName(), - UID: obj.GetUID(), - Controller: boolPtr(true), // Only one controller per resource - // BlockOwnerDeletion: intentionally omitted (permission issues) -} - -// Apply to child resources -job := &batchv1.Job{ - ObjectMeta: v1.ObjectMeta{ - Name: jobName, - Namespace: namespace, - OwnerReferences: []v1.OwnerReference{ownerRef}, - }, - // ...
-} -``` - -**Cleanup Patterns**: - -```go -// Rely on OwnerReferences for automatic cleanup, but delete explicitly when needed -policy := v1.DeletePropagationBackground -err := K8sClient.BatchV1().Jobs(ns).Delete(ctx, jobName, v1.DeleteOptions{ - PropagationPolicy: &policy, -}) -if err != nil && !errors.IsNotFound(err) { - log.Printf("Failed to delete job: %v", err) - return err -} -``` - -### Security Patterns - -**Token Handling**: - -```go -// Extract token from Authorization header -rawAuth := c.GetHeader("Authorization") -parts := strings.SplitN(rawAuth, " ", 2) -if len(parts) != 2 || !strings.EqualFold(parts[0], "Bearer") { - c.JSON(http.StatusUnauthorized, gin.H{"error": "invalid Authorization header"}) - return -} -token := strings.TrimSpace(parts[1]) - -// NEVER log the token itself -log.Printf("Processing request with token (len=%d)", len(token)) -``` - -**RBAC Enforcement**: - -```go -// Always check permissions before operations -ssar := &authv1.SelfSubjectAccessReview{ - Spec: authv1.SelfSubjectAccessReviewSpec{ - ResourceAttributes: &authv1.ResourceAttributes{ - Group: "vteam.ambient-code", - Resource: "agenticsessions", - Verb: "list", - Namespace: project, - }, - }, -} -res, err := reqK8s.AuthorizationV1().SelfSubjectAccessReviews().Create(ctx, ssar, v1.CreateOptions{}) -if err != nil || !res.Status.Allowed { - c.JSON(http.StatusForbidden, gin.H{"error": "Unauthorized"}) - return -} -``` - -**Container Security**: - -```go -// Always set SecurityContext for Job pods -SecurityContext: &corev1.SecurityContext{ - AllowPrivilegeEscalation: boolPtr(false), - ReadOnlyRootFilesystem: boolPtr(false), // Only if temp files needed - Capabilities: &corev1.Capabilities{ - Drop: []corev1.Capability{"ALL"}, // Drop all by default - }, -}, -``` - -### API Design Patterns - -**Project-Scoped Endpoints**: - -```go -// Standard pattern: /api/projects/:projectName/resource -r.GET("/api/projects/:projectName/agentic-sessions", ValidateProjectContext(), ListSessions) 
-r.POST("/api/projects/:projectName/agentic-sessions", ValidateProjectContext(), CreateSession) -r.GET("/api/projects/:projectName/agentic-sessions/:sessionName", ValidateProjectContext(), GetSession) - -// ValidateProjectContext middleware: -// 1. Extracts project from route param -// 2. Validates user has access via RBAC check -// 3. Sets project in context: c.Set("project", projectName) -``` - -**Middleware Chain**: - -```go -// Order matters: Recovery → Logging → CORS → Identity → Validation → Handler -r.Use(gin.Recovery()) -r.Use(gin.LoggerWithFormatter(customRedactingFormatter)) -r.Use(cors.New(corsConfig)) -r.Use(forwardedIdentityMiddleware()) // Extracts X-Forwarded-User, etc. -r.Use(ValidateProjectContext()) // RBAC check -``` - -**Response Patterns**: - -```go -// Success with data -c.JSON(http.StatusOK, gin.H{"items": sessions}) - -// Success with created resource -c.JSON(http.StatusCreated, gin.H{"message": "Session created", "name": name, "uid": uid}) - -// Success with no content -c.Status(http.StatusNoContent) - -// Errors with structured messages -c.JSON(http.StatusBadRequest, gin.H{"error": "Invalid request"}) -``` - -### Operator Patterns - -**Watch Loop with Reconnection**: - -```go -func WatchAgenticSessions() { - gvr := types.GetAgenticSessionResource() - - for { // Infinite loop with reconnection - watcher, err := config.DynamicClient.Resource(gvr).Watch(ctx, v1.ListOptions{}) - if err != nil { - log.Printf("Failed to create watcher: %v", err) - time.Sleep(5 * time.Second) // Backoff before retry - continue - } - - log.Println("Watching for events...") - - for event := range watcher.ResultChan() { - switch event.Type { - case watch.Added, watch.Modified: - obj := event.Object.(*unstructured.Unstructured) - handleEvent(obj) - case watch.Deleted: - // Handle cleanup - } - } - - log.Println("Watch channel closed, restarting...") - watcher.Stop() - time.Sleep(2 * time.Second) - } -} -``` - -**Reconciliation Pattern**: - -```go -func
handleEvent(obj *unstructured.Unstructured) error { - name := obj.GetName() - namespace := obj.GetNamespace() - - // 1. Verify resource still exists (avoid race conditions) - currentObj, err := getDynamicClient().Get(ctx, name, namespace) - if errors.IsNotFound(err) { - log.Printf("Resource %s no longer exists, skipping", name) - return nil // Not an error - } - - // 2. Get current phase/status - status, found, _ := unstructured.NestedMap(currentObj.Object, "status") - phase := getPhaseOrDefault(status, "Pending") - - // 3. Only reconcile if in expected state - if phase != "Pending" { - return nil // Already processed - } - - // 4. Create resources idempotently (check existence first) - if _, err := getResource(name); err == nil { - log.Printf("Resource %s already exists", name) - return nil - } - - // 5. Create and update status - createResource(...) - updateStatus(namespace, name, map[string]interface{}{"phase": "Creating"}) - - return nil -} -``` - -**Status Updates** (use UpdateStatus subresource): - -```go -func updateAgenticSessionStatus(namespace, name string, updates map[string]interface{}) error { - gvr := types.GetAgenticSessionResource() - - obj, err := config.DynamicClient.Resource(gvr).Namespace(namespace).Get(ctx, name, v1.GetOptions{}) - if errors.IsNotFound(err) { - log.Printf("Resource deleted, skipping status update") - return nil // Not an error - } - - if obj.Object["status"] == nil { - obj.Object["status"] = make(map[string]interface{}) - } - - status := obj.Object["status"].(map[string]interface{}) - for k, v := range updates { - status[k] = v - } - - // Use UpdateStatus subresource (requires /status permission) - _, err = config.DynamicClient.Resource(gvr).Namespace(namespace).UpdateStatus(ctx, obj, v1.UpdateOptions{}) - if errors.IsNotFound(err) { - return nil // Resource deleted during update - } - return err -} -``` - -**Goroutine Monitoring**: - -```go -// Start background monitoring (operator/internal/handlers/sessions.go:477) -go 
monitorJob(jobName, sessionName, namespace) - -// Monitoring loop checks both K8s Job status AND custom container status -func monitorJob(jobName, sessionName, namespace string) { - for { - time.Sleep(5 * time.Second) - - // 1. Check if parent resource still exists (exit if deleted) - if _, err := getSession(namespace, sessionName); errors.IsNotFound(err) { - log.Printf("Session deleted, stopping monitoring") - return - } - - // 2. Check Job status - job, err := K8sClient.BatchV1().Jobs(namespace).Get(ctx, jobName, v1.GetOptions{}) - if errors.IsNotFound(err) { - return - } - - // 3. Update status based on Job conditions - if job.Status.Succeeded > 0 { - updateStatus(namespace, sessionName, map[string]interface{}{ - "phase": "Completed", - "completionTime": time.Now().Format(time.RFC3339), - }) - cleanup(namespace, jobName) - return - } - } -} -``` - -### Pre-Commit Checklist for Backend/Operator - -Before committing backend or operator code, verify: - -- [ ] **Authentication**: All user-facing endpoints use `GetK8sClientsForRequest(c)` -- [ ] **Authorization**: RBAC checks performed before resource access -- [ ] **Error Handling**: All errors logged with context, appropriate HTTP status codes -- [ ] **Token Security**: No tokens or sensitive data in logs -- [ ] **Type Safety**: Used `unstructured.Nested*` helpers, checked `found` before using values -- [ ] **Resource Cleanup**: OwnerReferences set on all child resources -- [ ] **Status Updates**: Used `UpdateStatus` subresource, handled IsNotFound gracefully -- [ ] **Tests**: Added/updated tests for new functionality -- [ ] **Logging**: Structured logs with relevant context (namespace, resource name, etc.) -- [ ] **Code Quality**: Ran all linting checks locally (see below) - -**Run these commands before committing:** - -```bash -# Backend -cd components/backend -gofmt -l . # Check formatting (should output nothing) -go vet ./... 
# Detect suspicious constructs -golangci-lint run # Run comprehensive linting - -# Operator -cd components/operator -gofmt -l . -go vet ./... -golangci-lint run -``` - -**Auto-format code:** - -```bash -gofmt -w components/backend components/operator -``` - -**Note**: GitHub Actions will automatically run these checks on your PR. Fix any issues locally before pushing. - -### Common Mistakes to Avoid - -**Backend**: - -- ❌ Using service account client for user operations (always use user token) -- ❌ Not checking if user-scoped client creation succeeded -- ❌ Logging full token values (use `len(token)` instead) -- ❌ Not validating project access in middleware -- ❌ Type assertions without checking: `val := obj["key"].(string)` (use `val, ok := ...`) -- ❌ Not setting OwnerReferences (causes resource leaks) -- ❌ Treating IsNotFound as fatal error during cleanup -- ❌ Exposing internal error details to API responses (use generic messages) - -**Operator**: - -- ❌ Not reconnecting watch on channel close -- ❌ Processing events without verifying resource still exists -- ❌ Updating status on main object instead of /status subresource -- ❌ Not checking current phase before reconciliation (causes duplicate resources) -- ❌ Creating resources without idempotency checks -- ❌ Goroutine leaks (not exiting monitor when resource deleted) -- ❌ Using `panic()` in watch/reconciliation loops -- ❌ Not setting SecurityContext on Job pods - -### Reference Files - -Study these files to understand established patterns: - -**Backend**: - -- `components/backend/handlers/sessions.go` - Complete session lifecycle, user/SA client usage -- `components/backend/handlers/middleware.go` - Auth patterns, token extraction, RBAC -- `components/backend/handlers/helpers.go` - Utility functions (StringPtr, BoolPtr) -- `components/backend/types/common.go` - Type definitions -- `components/backend/server/server.go` - Server setup, middleware chain, token redaction -- `components/backend/routes.go` - HTTP route
definitions and registration - -**Operator**: - -- `components/operator/internal/handlers/sessions.go` - Watch loop, reconciliation, status updates -- `components/operator/internal/config/config.go` - K8s client initialization -- `components/operator/internal/types/resources.go` - GVR definitions -- `components/operator/internal/services/infrastructure.go` - Reusable services - -## GitHub Actions CI/CD - -### Component Build Pipeline (`.github/workflows/components-build-deploy.yml`) - -- **Change detection**: Only builds modified components (frontend, backend, operator, claude-runner) -- **Multi-platform builds**: linux/amd64 and linux/arm64 -- **Registry**: Pushes to `quay.io/ambient_code` on main branch -- **PR builds**: Build-only, no push on pull requests - -### Automation Workflows - -- **amber-issue-handler.yml**: Amber background agent - automated fixes via GitHub issue labels (`amber:auto-fix`, `amber:refactor`, `amber:test-coverage`) or `/amber execute` command -- **amber-dependency-sync.yml**: Daily sync of dependency versions to Amber agent knowledge base -- **claude.yml**: Claude Code integration - responds to `@claude` mentions in issues/PRs -- **claude-code-review.yml**: Automated code reviews on pull requests - -### Code Quality Workflows - -- **go-lint.yml**: Go code formatting, vetting, and linting (gofmt, go vet, golangci-lint) -- **frontend-lint.yml**: Frontend code quality (ESLint, TypeScript checking, build validation) - -### Deployment & Testing Workflows - -- **prod-release-deploy.yaml**: Production releases with semver versioning and changelog generation -- **e2e.yml**: End-to-end Cypress testing in kind cluster (see Testing Strategy section) -- **test-local-dev.yml**: Local development environment validation - -### Utility Workflows - -- **docs.yml**: Deploy MkDocs documentation to GitHub Pages -- **dependabot-auto-merge.yml**: Auto-approve and merge Dependabot dependency updates - -## Testing Strategy - -### E2E Tests (Cypress + Kind) - 
-**Purpose**: Automated end-to-end testing of the complete vTeam stack in a Kubernetes environment. - -**Location**: `e2e/` - -**Quick Start**: - -```bash -make e2e-test CONTAINER_ENGINE=podman # Or docker -``` - -**What Gets Tested**: - -- ✅ Full vTeam deployment in kind (Kubernetes in Docker) -- ✅ Frontend UI rendering and navigation -- ✅ Backend API connectivity -- ✅ Project creation workflow (main user journey) -- ✅ Authentication with ServiceAccount tokens -- ✅ Ingress routing -- ✅ All pods deploy and become ready - -**What Doesn't Get Tested**: - -- ❌ OAuth proxy flow (uses direct token auth for simplicity) -- ❌ Session pod execution (requires Anthropic API key) -- ❌ Multi-user scenarios - -**Test Suite** (`e2e/cypress/e2e/vteam.cy.ts`): - -1. UI loads with token authentication -2. Navigate to new project page -3. Create a new project -4. List created projects -5. Backend API cluster-info endpoint - -**CI Integration**: Tests run automatically on all PRs via GitHub Actions (`.github/workflows/e2e.yml`) - -**Key Implementation Details**: - -- **Architecture**: Frontend without oauth-proxy, direct token injection via environment variables -- **Authentication**: Test user ServiceAccount with cluster-admin permissions -- **Token Handling**: Frontend deployment includes `OC_TOKEN`, `OC_USER`, `OC_EMAIL` env vars -- **Podman Support**: Auto-detects runtime, uses ports 8080/8443 for rootless Podman -- **Ingress**: Standard nginx-ingress with path-based routing - -**Adding New Tests**: - -```typescript -it('should test new feature', () => { - cy.visit('/some-page') - cy.contains('Expected Content').should('be.visible') - cy.get('#button').click() - // Auth header automatically injected via beforeEach interceptor -}) -``` - -**Debugging Tests**: - -```bash -cd e2e -source .env.test -CYPRESS_TEST_TOKEN="$TEST_TOKEN" CYPRESS_BASE_URL="http://vteam.local:8080" npm run test:headed -``` - -**Documentation**: See `e2e/README.md` and `docs/testing/e2e-guide.md`
for comprehensive testing guide. - -### Backend Tests (Go) - -- **Unit tests** (`tests/unit/`): Isolated component logic -- **Contract tests** (`tests/contract/`): API contract validation -- **Integration tests** (`tests/integration/`): End-to-end with real k8s cluster - - Requires `TEST_NAMESPACE` environment variable - - Set `CLEANUP_RESOURCES=true` for automatic cleanup - - Permission tests validate RBAC boundaries - -### Frontend Tests (NextJS) - -- Jest for component testing (when configured) -- Cypress for e2e testing (see E2E Tests section above) - -### Operator Tests (Go) - -- Controller reconciliation logic tests -- CRD validation tests - -## Documentation Structure - -The MkDocs site (`mkdocs.yml`) provides: - -- **User Guide**: Getting started, RFE creation, agent framework, configuration -- **Developer Guide**: Setup, architecture, plugin development, API reference, testing -- **Labs**: Hands-on exercises (basic → advanced → production) - - Basic: First RFE, agent interaction, workflow basics - - Advanced: Custom agents, workflow modification, integration testing - - Production: Jira integration, OpenShift deployment, scaling -- **Reference**: Agent personas, API endpoints, configuration schema, glossary - -### Documentation Standards - -**Default to improving existing documentation** rather than creating new files.
When adding or updating documentation (standalone files like `.md`, design docs, guides): - -- **Prefer inline updates**: Improve existing markdown files or code comments -- **Colocate new docs**: When feasible, documentation should live in the subdirectory that has the relevant code (e.g., `components/backend/README.md`) not at the top level -- **Avoid top-level proliferation**: Only create top-level docs for cross-cutting concerns (architecture, security, deployment) -- **Follow established patterns**: See `docs/amber-quickstart.md` and `components/backend/README.md` for examples of well-organized documentation - -### Director Training Labs - -Special lab track for leadership training located in `docs/labs/director-training/`: - -- Structured exercises for understanding the vTeam system from a strategic perspective -- Validation reports for tracking completion and understanding - -## Production Considerations - -### Security - -- **API keys**: Store in Kubernetes Secrets, managed via ProjectSettings CR -- **RBAC**: Namespace-scoped isolation prevents cross-project access -- **OAuth integration**: OpenShift OAuth for cluster-based authentication (see `docs/OPENSHIFT_OAUTH.md`) -- **Network policies**: Component isolation and secure communication - -### Monitoring - -- **Health endpoints**: `/health` on backend API -- **Logs**: Structured logging with OpenShift integration -- **Metrics**: Prometheus-compatible (when configured) -- **Events**: Kubernetes events for operator actions - -### Scaling - -- **Horizontal Pod Autoscaling**: Configure based on CPU/memory -- **Job concurrency**: Operator manages concurrent session execution -- **Resource limits**: Set appropriate requests/limits per component -- **Multi-tenancy**: Project-based isolation with shared infrastructure - ---- - -## Frontend Development Standards - -**See `components/frontend/DESIGN_GUIDELINES.md` for complete frontend development patterns.** - -### Critical Rules (Quick Reference) - -1. 
**Zero `any` Types** - Use proper types, `unknown`, or generic constraints -2. **Shadcn UI Components Only** - Use `@/components/ui/*` components, no custom UI from scratch -3. **React Query for ALL Data Operations** - Use hooks from `@/services/queries/*`, no manual `fetch()` -4. **Use `type` over `interface`** - Always prefer `type` for type definitions -5. **Colocate Single-Use Components** - Keep page-specific components with their pages - -### Pre-Commit Checklist for Frontend - -Before committing frontend code: - -- [ ] Zero `any` types (or justified with eslint-disable) -- [ ] All UI uses Shadcn components -- [ ] All data operations use React Query -- [ ] Components under 200 lines -- [ ] Single-use components colocated with their pages -- [ ] All buttons have loading states -- [ ] All lists have empty states -- [ ] All nested pages have breadcrumbs -- [ ] All routes have loading.tsx, error.tsx -- [ ] `npm run build` passes with 0 errors, 0 warnings -- [ ] All types use `type` instead of `interface` - -### Reference Files - -- `components/frontend/DESIGN_GUIDELINES.md` - Detailed patterns and examples -- `components/frontend/COMPONENT_PATTERNS.md` - Architecture patterns -- `components/frontend/src/components/ui/` - Available Shadcn components -- `components/frontend/src/services/` - API service layer examples diff --git a/CLAUDE.md b/CLAUDE.md new file mode 120000 index 000000000..47dc3e3d8 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1 @@ +AGENTS.md \ No newline at end of file diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 142e9b13a..795a5ec22 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -226,7 +226,7 @@ go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest - Use `unstructured.Nested*` helpers for type-safe CR access - Set OwnerReferences on child resources for automatic cleanup -See [CLAUDE.md](CLAUDE.md) for comprehensive backend/operator development standards. 
+See [AGENTS.md](AGENTS.md) for comprehensive backend/operator development standards. ### Frontend Code (NextJS) @@ -589,7 +589,7 @@ oc describe pvc backend-state-pvc -n vteam-dev If you're stuck or have questions: 1. **Check existing documentation:** - - [CLAUDE.md](CLAUDE.md) - Comprehensive development standards + - [AGENTS.md](AGENTS.md) - Comprehensive development standards - [README.md](README.md) - Project overview and quick start - [docs/](docs/) - Additional documentation diff --git a/README.md b/README.md index eeac5098d..d881af25f 100644 --- a/README.md +++ b/README.md @@ -543,7 +543,7 @@ See [e2e/README.md](e2e/README.md) for detailed documentation, troubleshooting, - Update relevant documentation when changing functionality - Follow existing documentation style (Markdown) - Add code comments for complex logic -- Update CLAUDE.md if adding new patterns or standards +- Update AGENTS.md if adding new patterns or standards ## Support & Documentation diff --git a/agents/amber.md b/agents/amber.md index 9d8873509..d8faebef8 100644 --- a/agents/amber.md +++ b/agents/amber.md @@ -28,7 +28,7 @@ You are Amber, the Ambient Code Platform's expert colleague and codebase intelli - When you identify a bug, include the fix **4. Team Fit** -- Respect project standards (CLAUDE.md, DESIGN_GUIDELINES.md) +- Respect project standards (AGENTS.md, DESIGN_GUIDELINES.md) - Learn from past decisions (git history, closed PRs, issue comments) - Adapt tone to context: terse in commits, detailed in RFCs - Make the team look goodβ€”your work enables theirs @@ -100,14 +100,14 @@ Low - Changes isolated to session handler, no API schema changes You operate within a clear authority hierarchy: 1. **Constitution** (`.specify/memory/constitution.md`) - ABSOLUTE authority, supersedes everything -2. **CLAUDE.md** - Project development standards, implements constitution +2. **AGENTS.md** - Project development standards, implements constitution 3. 
**Your Persona** (`agents/amber.md`) - Domain expertise within constitutional bounds 4. **User Instructions** - Task guidance, cannot override constitution **When Conflicts Arise:** - Constitution always wins - no exceptions - Politely decline requests that violate constitution, explain why -- CLAUDE.md preferences are negotiable with user approval +- AGENTS.md preferences are negotiable with user approval - Your expertise guides implementation within constitutional compliance ### Visual: Authority Hierarchy & Conflict Resolution @@ -117,14 +117,14 @@ flowchart TD Start([User Request]) --> CheckConst{Violates
Constitution?} CheckConst -->|YES| Decline[❌ Politely Decline
Explain principle violated
Suggest alternative] - CheckConst -->|NO| CheckCLAUDE{Conflicts with
CLAUDE.md?} + CheckConst -->|NO| CheckCLAUDE{Conflicts with
AGENTS.md?} CheckCLAUDE -->|YES| Warn[⚠️ Warn User
Explain preference
Ask confirmation] CheckCLAUDE -->|NO| CheckAgent{Within your
expertise?} Warn --> UserConfirm{User
Confirms?} UserConfirm -->|YES| Implement - UserConfirm -->|NO| UseStandard[Use CLAUDE.md standard] + UserConfirm -->|NO| UseStandard[Use AGENTS.md standard] CheckAgent -->|YES| Implement[✅ Implement Request
Follow constitution
Apply expertise] CheckAgent -->|NO| Implement @@ -151,12 +151,12 @@ flowchart TD **Decision Flow:** 1. **Constitution Check** - FIRST and absolute -2. **CLAUDE.md Check** - Warn but negotiable +2. **AGENTS.md Check** - Warn but negotiable 3. **Implementation** - Apply expertise within bounds **Example Scenarios:** - Request: "Skip tests" → Constitution violation → Decline -- Request: "Use docker" → CLAUDE.md preference (podman) → Warn, ask confirmation +- Request: "Use docker" → AGENTS.md preference (podman) → Warn, ask confirmation - Request: "Add logging" → No conflicts → Implement with structured logging (constitution compliance) **Detailed Examples:** @@ -166,14 +166,14 @@ flowchart TD - "Use panic() for error handling" → Constitution Principle III violation → Decline: "panic() is forbidden in production code per Constitution Principle III. I'll use fmt.Errorf() with context instead." - "Don't worry about linting, just commit it" → Constitution Principle X violation → Decline: "Constitution Principle X requires running linters before commits (gofmt, golangci-lint). I can run them now - takes <30 seconds." -**CLAUDE.md Preferences (Warn, Ask Confirmation):** -- "Build the container with docker" → CLAUDE.md prefers podman → Warn: "⚠️ CLAUDE.md specifies podman over docker. Should I use podman instead, or proceed with docker?" -- "Create a new Docker Compose file" → CLAUDE.md uses K8s/OpenShift → Warn: "⚠️ This project uses Kubernetes manifests (see components/manifests/). Docker Compose isn't in the standard stack. Should I create K8s manifests instead?" +**AGENTS.md Preferences (Warn, Ask Confirmation):** +- "Build the container with docker" → AGENTS.md prefers podman → Warn: "⚠️ AGENTS.md specifies podman over docker. Should I use podman instead, or proceed with docker?" +- "Create a new Docker Compose file" → AGENTS.md uses K8s/OpenShift → Warn: "⚠️ This project uses Kubernetes manifests (see components/manifests/).
Docker Compose isn't in the standard stack. Should I create K8s manifests instead?" - "Change the Docker image registry" → Acceptable with justification → Warn: "⚠️ Standard registry is quay.io/ambient_code. Changing this may affect CI/CD. Confirm you want to proceed?" **Within Expertise (Implement):** - "Add structured logging to this handler" → No conflicts → Implement with constitution compliance (Principle VI) -- "Refactor this reconciliation loop" → No conflicts → Implement following operator patterns from CLAUDE.md +- "Refactor this reconciliation loop" → No conflicts → Implement following operator patterns from AGENTS.md - "Review this PR for security issues" → No conflicts → Perform analysis using ACP security standards ## ACP Constitution Compliance @@ -501,7 +501,7 @@ Full details: [link] **Quality Standards:** - Run linters before any commit (gofmt, black, isort, prettier, markdownlint) - Zero tolerance for test failures -- Follow CLAUDE.md and DESIGN_GUIDELINES.md +- Follow AGENTS.md and DESIGN_GUIDELINES.md - Conventional commits, squash on merge - All PRs include issue reference (`Fixes #123`) @@ -579,7 +579,7 @@ Full details: [link] - Runner workspace sync delays (PVC provisioning) - Langfuse integration (missing env vars, network policies) -**Team Preferences (from CLAUDE.md):** +**Team Preferences (from AGENTS.md):** - Squash commits, always - Git feature branches, never commit to main - Python: uv over pip, virtual environments always diff --git a/components/backend/README.md b/components/backend/README.md index 25a9608e6..919fe2197 100644 --- a/components/backend/README.md +++ b/components/backend/README.md @@ -97,7 +97,7 @@ make check-env # Verify Go, kubectl, docker installed ## Architecture -See `CLAUDE.md` in project root for: +See `AGENTS.md` in project root for: - Critical development rules - Kubernetes client patterns - Error handling patterns diff --git a/components/operator/README.md b/components/operator/README.md
index 8ce001d33..c62894f5e 100644 --- a/components/operator/README.md +++ b/components/operator/README.md @@ -93,7 +93,7 @@ operator/ ### Key Patterns -See `CLAUDE.md` in project root for: +See `AGENTS.md` in project root for: - Watch loop with reconnection - Reconciliation pattern - Status updates (UpdateStatus subresource) diff --git a/components/runners/claude-code-runner/pyproject.toml b/components/runners/claude-code-runner/pyproject.toml index 6278e3649..7bf7a456c 100644 --- a/components/runners/claude-code-runner/pyproject.toml +++ b/components/runners/claude-code-runner/pyproject.toml @@ -2,7 +2,6 @@ name = "claude-code-runner" version = "0.1.0" description = "Runner that streams via Claude Code SDK and syncs workspace via PVC proxy" -readme = "CLAUDE.md" requires-python = ">=3.11" authors = [ { name = "vTeam" } diff --git a/docs/amber-automation.md b/docs/amber-automation.md index 9ef6c416e..50fd34482 100644 --- a/docs/amber-automation.md +++ b/docs/amber-automation.md @@ -150,7 +150,7 @@ File: `path/to/another.py` ## Success Criteria - [ ] All linters pass - [ ] All tests pass -- [ ] Follows CLAUDE.md standards +- [ ] Follows AGENTS.md standards ``` **Key Fields**: @@ -166,7 +166,7 @@ Amber's behavior is controlled by: 1. **Workflow**: `.github/workflows/amber-issue-handler.yml` 2. **Config**: `.claude/amber-config.yml` (automation policies) -3. **Project Standards**: `CLAUDE.md` (Amber follows all project conventions) +3. **Project Standards**: `AGENTS.md` (Amber follows all project conventions) ### Risk Levels @@ -423,8 +423,8 @@ A: Depends on task complexity. 
Typical auto-fix: ~10K tokens ($0.03), refactorin ## Related Documentation -- [Amber Configuration](https://github.com/ambient-code/platform/blob/main/.claude/amber-config.yml) - Automation policies -- [Project Standards](https://github.com/ambient-code/platform/blob/main/CLAUDE.md) - Conventions Amber follows +- [Amber Configuration](.claude/amber-config.yml) - Automation policies +- [Project Standards](../AGENTS.md) - Conventions Amber follows - [GitHub Actions Security](https://docs.github.com/en/actions/security-for-github-actions) - Official security guide --- diff --git a/docs/amber-quickstart.md b/docs/amber-quickstart.md index be7117900..5a15655dc 100644 --- a/docs/amber-quickstart.md +++ b/docs/amber-quickstart.md @@ -140,7 +140,7 @@ Break into modules: ## Constraints - Maintain backward compatibility - All existing tests must pass -- Follow CLAUDE.md standards +- Follow AGENTS.md standards ## Priority P0 - Critical @@ -326,8 +326,8 @@ gh workflow enable amber-issue-handler.yml ## Full Documentation - [Complete Guide](amber-automation.md) - Detailed documentation -- [Amber Config](https://github.com/ambient-code/platform/blob/main/.claude/amber-config.yml) - Automation policies -- [Project Standards](https://github.com/ambient-code/platform/blob/main/CLAUDE.md) - Conventions Amber follows +- [Amber Config](.claude/amber-config.yml) - Automation policies +- [Project Standards](../AGENTS.md) - Conventions Amber follows --- diff --git a/docs/implementation-plans/amber-implementation.md b/docs/implementation-plans/amber-implementation.md index fbd797482..d5745afcc 100644 --- a/docs/implementation-plans/amber-implementation.md +++ b/docs/implementation-plans/amber-implementation.md @@ -81,28 +81,28 @@ Amber is ACP's expert AI colleague with multiple operating modes: | Layer | File | Scope | Authority | When It Applies | Conflict Resolution | |-------|------|-------|-----------|-----------------|---------------------| | **1. 
Constitution** | `.specify/memory/constitution.md` | All code, all agents, all work | **ABSOLUTE** - Supersedes everything | Always - non-negotiable | Constitution wins, no exceptions | -| **2. Project Guidance** | `CLAUDE.md` | Development commands, architecture patterns | **HIGH** - Project standards | Claude Code development sessions | Must align with constitution | +| **2. Project Guidance** | `AGENTS.md` | Development commands, architecture patterns | **HIGH** - Project standards | Claude Code development sessions | Must align with constitution | | **3. Agent Persona** | `agents/amber.md` (or other agent) | Domain expertise, personality, workflows | **MEDIUM** - Tactical implementation | When agent is invoked by user | Must follow #1 and #2 | | **4. User Instructions** | Session prompt, chat messages | Task-specific guidance | **VARIABLE** - Depends on compliance | Current session only | Cannot override #1, can override #2-3 if constitutional | **Key Principles:** -1. **Constitution is Law**: No agent, no user instruction, no CLAUDE.md rule can override the constitution. Ever. +1. **Constitution is Law**: No agent, no user instruction, no AGENTS.md rule can override the constitution. Ever. -2. **CLAUDE.md Implements Constitution**: Project guidance operationalizes constitutional principles for Claude Code (e.g., "run gofmt before commits" implements Principle III). +2. **AGENTS.md Implements Constitution**: Project guidance operationalizes constitutional principles for Claude Code (e.g., "run gofmt before commits" implements Principle III). -3. **Agents Enforce Both**: Amber and other agents MUST follow constitution + CLAUDE.md while providing domain expertise. +3. **Agents Enforce Both**: Amber and other agents MUST follow constitution + AGENTS.md while providing domain expertise. 4. **User Can't Break Rules**: If user asks Amber to violate constitution (e.g., "skip tests"), Amber politely declines and explains why. -5. 
**Multi-Agent Sessions**: When multiple agents collaborate, ALL follow the same hierarchy. Constitution > CLAUDE.md > individual agent persona. +5. **Multi-Agent Sessions**: When multiple agents collaborate, ALL follow the same hierarchy. Constitution > AGENTS.md > individual agent persona. **Example Scenarios:** | Scenario | User Asks | Amber's Response | Why | |----------|-----------|------------------|-----| | Constitutional violation | "Just commit without tests" | ❌ Declines: "Constitution Principle IV requires TDD. Let's write tests first." | Constitution supersedes user | -| CLAUDE.md preference | "Use docker instead of podman" | ⚠️ Warns: "CLAUDE.md prefers podman. Proceed with docker?" | Project standard, but negotiable | +| AGENTS.md preference | "Use docker instead of podman" | ⚠️ Warns: "AGENTS.md prefers podman. Proceed with docker?" | Project standard, but negotiable | | Agent expertise | "How should I structure this?" | ✅ Provides: Amber's ACP-specific architectural guidance | Agent domain knowledge | | User preference | "Use verbose logging here" | ✅ Implements: Adds detailed logs | User choice within constitutional bounds | @@ -279,14 +279,14 @@ Delete line: `- \`RFEWorkflow\` (rfeworkflows.vteam.ambient-code): Engineering r You operate within a clear authority hierarchy: 1. **Constitution** (`.specify/memory/constitution.md`) - ABSOLUTE authority, supersedes everything -2. **CLAUDE.md** - Project development standards, implements constitution +2. **AGENTS.md** - Project development standards, implements constitution 3. **Your Persona** (`agents/amber.md`) - Domain expertise within constitutional bounds 4.
**User Instructions** - Task guidance, cannot override constitution **When Conflicts Arise:** - Constitution always wins - no exceptions - Politely decline requests that violate constitution, explain why -- CLAUDE.md preferences are negotiable with user approval +- AGENTS.md preferences are negotiable with user approval - Your expertise guides implementation within constitutional compliance ## ACP Constitution Compliance @@ -409,7 +409,7 @@ grep -c "Here's my plan\|I'm 90% confident\|To roll this back\|I investigated 3 **5. Authority Hierarchy & Conflict Resolution** - **Location:** `agents/amber.md` (in "Authority Hierarchy" section, after "When Conflicts Arise") - **Type:** Flowchart -- **Shows:** Decision tree for handling user requests (Constitution → CLAUDE.md → Implementation) +- **Shows:** Decision tree for handling user requests (Constitution → AGENTS.md → Implementation) - **Key Feature:** Color-coded paths (red=decline, yellow=warn, green=implement) ### Diagram Design Standards @@ -563,7 +563,7 @@ Amber operates within a clear hierarchy to ensure quality and compliance: | Priority | What | Authority | Notes | |----------|------|-----------|-------| | **1** | **ACP Constitution** | Absolute | Amber cannot violate constitution principles, even if you ask | -| **2** | **CLAUDE.md** | High | Project standards; negotiable with your approval | +| **2** | **AGENTS.md** | High | Project standards; negotiable with your approval | | **3** | **Amber's Expertise** | Medium | ACP-specific guidance within constitutional bounds | | **4** | **Your Instructions** | Variable | Must align with constitution and project standards | @@ -571,7 +571,7 @@ Amber operates within a clear hierarchy to ensure quality and compliance: ✅ **Amber will decline**: Requests that violate the constitution (e.g., "skip tests", "use panic()", "commit without linting") -⚠️ **Amber will warn**: Deviations from CLAUDE.md preferences (e.g., "docker instead of podman") but proceed if you
confirm +⚠️ **Amber will warn**: Deviations from AGENTS.md preferences (e.g., "docker instead of podman") but proceed if you confirm ✅ **Amber will implement**: Your task requirements within constitutional and project compliance @@ -970,7 +970,7 @@ git show --stat HEAD ## Key Changes Summary **Governance & Hierarchy:** -- Clear authority model: Constitution > CLAUDE.md > Agent Persona > User Instructions +- Clear authority model: Constitution > AGENTS.md > Agent Persona > User Instructions - Embedded constitution compliance with daily validation - Auto-file issues on constitution violations (workflow continues) - User-facing documentation explains when Amber will decline requests diff --git a/docs/labs/basic/lab-1-first-rfe.md b/docs/labs/basic/lab-1-first-rfe.md index e74fe1920..7cb6aaaee 100644 --- a/docs/labs/basic/lab-1-first-rfe.md +++ b/docs/labs/basic/lab-1-first-rfe.md @@ -327,7 +327,7 @@ Ready to dig deeper? - **Experiment with timeouts**: Find optimal values for different task types - **Explore multi-repo workflows**: Cross-repository analysis and migration - **Customize ProjectSettings**: Configure default models, timeouts, and API keys -- **Review CLAUDE.md**: Understand the complete AgenticSession specification +- **Review AGENTS.md**: Understand the complete AgenticSession specification ## Success Criteria ✅ diff --git a/docs/labs/index.md b/docs/labs/index.md index 28d1d8b34..48a09c842 100644 --- a/docs/labs/index.md +++ b/docs/labs/index.md @@ -132,7 +132,7 @@ Once you've completed Lab 1, explore advanced AgenticSession capabilities: - **Interactive sessions**: Build iterative development workflows using inbox/outbox communication - **Custom ProjectSettings**: Configure default models, timeouts, and team-specific settings - **API integration**: Automate session creation via REST API for CI/CD pipelines -- **CLAUDE.md exploration**: Deep-dive into the complete AgenticSession specification and backend architecture +- **AGENTS.md exploration**:
Deep-dive into the complete AgenticSession specification and backend architecture ## Ready to Start? diff --git a/docs/reference/index.md b/docs/reference/index.md index 6cd2fc693..3138290a1 100644 --- a/docs/reference/index.md +++ b/docs/reference/index.md @@ -106,7 +106,7 @@ Specialized Custom Resource for Request for Enhancement workflows using a 7-agen **API Version**: `vteam.ambient-code/v1alpha1` **Kind**: `RFEWorkflow` -This is an advanced feature not covered in the standard user documentation. For implementation details, see the project's CLAUDE.md file in the repository root. +This is an advanced feature not covered in the standard user documentation. For implementation details, see the project's AGENTS.md file in the repository root. ## REST API Endpoints @@ -300,7 +300,7 @@ kubectl logs job/ -n - **User Guide**: [Getting Started](../user-guide/getting-started.md) for usage instructions - **Labs**: [Hands-on exercises](../labs/index.md) for practical learning - **Deployment Guides**: [OpenShift deployment](../OPENSHIFT_DEPLOY.md) for production setup -- **Contributing**: See the project's CLAUDE.md file in the repository root for contributor guidelines and architecture details +- **Contributing**: See the project's AGENTS.md file in the repository root for contributor guidelines and architecture details --- diff --git a/docs/user-guide/getting-started.md b/docs/user-guide/getting-started.md index 4baaec7e5..5dd6e4f63 100644 --- a/docs/user-guide/getting-started.md +++ b/docs/user-guide/getting-started.md @@ -189,7 +189,7 @@ Now that the Ambient Code Platform is running, you're ready to: If you encounter issues not covered here: -- **Check CLAUDE.md** in the repository root for detailed development documentation +- **Check AGENTS.md** in the repository root for detailed development documentation - **Search existing issues** β†’ [GitHub Issues](https://github.com/ambient-code/platform/issues) - **Create a new issue** with your error details and environment 
info diff --git a/docs/user-guide/index.md b/docs/user-guide/index.md index 53cb68327..ee020e999 100644 --- a/docs/user-guide/index.md +++ b/docs/user-guide/index.md @@ -125,7 +125,7 @@ If you encounter issues: - **Common problems**: See the [Troubleshooting section](getting-started.md#common-issues) in Getting Started - **Documentation bugs**: [Submit an issue](https://github.com/ambient-code/platform/issues) - **Questions**: [GitHub Discussions](https://github.com/ambient-code/platform/discussions) -- **CLAUDE.md**: Check the project root for detailed development documentation +- **AGENTS.md**: Check the project root for detailed development documentation --- diff --git a/docs/user-guide/working-with-amber.md b/docs/user-guide/working-with-amber.md index 23629734a..c08aa6780 100644 --- a/docs/user-guide/working-with-amber.md +++ b/docs/user-guide/working-with-amber.md @@ -33,7 +33,7 @@ Amber is the Ambient Code Platform's AI colleague—an expert in your codebase w | Category | What Amber Does | |----------|----------------| -| **Codebase Intelligence** | Deep knowledge of architecture, patterns (CLAUDE.md, DESIGN_GUIDELINES.md), dependencies (K8s, Claude SDK, OpenShift, Go, NextJS, Langfuse), common issues | +| **Codebase Intelligence** | Deep knowledge of architecture, patterns (AGENTS.md, DESIGN_GUIDELINES.md), dependencies (K8s, Claude SDK, OpenShift, Go, NextJS, Langfuse), common issues | | **Proactive Maintenance** | Monitors upstream for breaking changes, scans dependencies, detects issue patterns, generates health reports | | **Autonomy Levels** | Level 1: Read-only analysis; Level 2: Creates PRs for review; Level 3: Auto-merges low-risk changes; Level 4: Full autonomy (future) | @@ -122,7 +122,7 @@ Amber operates within a clear hierarchy to ensure quality and compliance: | Priority | What | Authority | Notes | |----------|------|-----------|-------| | **1** | **ACP Constitution** | Absolute | Amber cannot violate constitution principles, even if you ask
| -| **2** | **CLAUDE.md** | High | Project standards; negotiable with your approval | +| **2** | **AGENTS.md** | High | Project standards; negotiable with your approval | | **3** | **Amber's Expertise** | Medium | ACP-specific guidance within constitutional bounds | | **4** | **Your Instructions** | Variable | Must align with constitution and project standards | @@ -130,7 +130,7 @@ Amber operates within a clear hierarchy to ensure quality and compliance: ✅ **Amber will decline**: Requests that violate the constitution (e.g., "skip tests", "use panic()", "commit without linting") -⚠️ **Amber will warn**: Deviations from CLAUDE.md preferences (e.g., "docker instead of podman") but proceed if you confirm +⚠️ **Amber will warn**: Deviations from AGENTS.md preferences (e.g., "docker instead of podman") but proceed if you confirm ✅ **Amber will implement**: Your task requirements within constitutional and project compliance @@ -481,7 +481,7 @@ sequenceDiagram else PR Created GH->>A: PR created webhook A->>A: Check linting compliance - A->>A: Verify standards (CLAUDE.md) + A->>A: Verify standards (AGENTS.md) A->>A: Scan for breaking changes alt Unique Value Found A->>GH: Add inline comment @@ -525,7 +525,7 @@ find related issues, and suggest an assignee. ### Code Review ``` -Amber, review PR #456 for CLAUDE.md standards compliance, security concerns, +Amber, review PR #456 for AGENTS.md standards compliance, security concerns, performance impact, and missing tests. ```