Skip to content

Conversation

@Gkrumbach07
Copy link
Collaborator

Problem

E2E tests fail on fork PRs because they don't have access to GitHub secrets (ANTHROPIC_API_KEY).

Solution

Changed E2E workflow to ONLY run when safe-to-test label is added to a PR.

Security model:

  • Uses pull_request_target for secret access
  • Requires explicit maintainer approval via label
  • Runs full test suite with ANTHROPIC_API_KEY

Usage

# Maintainer reviews PR code
# Then adds label to trigger tests
gh pr edit <PR_NUMBER> --add-label safe-to-test

Benefits

  • 🔒 Secure: Maintainer review required before running with secrets
  • 🎯 Simple: One workflow, one trigger
  • 💯 Complete: Full agent interaction testing
  • Clean: Results show in Actions UI, no PR comment spam

Changes the E2E workflow to ONLY run when 'safe-to-test' label is added to a PR.

Security model:
- Uses pull_request_target for secret access
- Requires explicit maintainer approval via label
- Runs full test suite with ANTHROPIC_API_KEY

Usage:
  gh pr edit <PR_NUMBER> --add-label safe-to-test

Benefits:
- Secure: maintainer review required before running with secrets
- Simple: one workflow, one trigger
- Complete: full agent interaction testing
@Gkrumbach07 Gkrumbach07 added the safe-to-test Maintainer-approved: Run E2E tests with secrets label Feb 3, 2026
@Gkrumbach07
Copy link
Collaborator Author

Clean PR created from upstream/main

This PR contains just the E2E workflow changes:

  • Trigger only on safe-to-test label
  • Uses pull_request_target for secret access
  • No extra commits, no merge conflicts

The safe-to-test label is already added, so tests should be running now!

Watch here: https://github.com/ambient-code/platform/actions/workflows/e2e.yml

@github-actions
Copy link
Contributor

github-actions bot commented Feb 3, 2026

Claude Code Review

Summary

PR #576 simplifies the E2E workflow by removing PR comment notifications while maintaining the secure label-based trigger mechanism. This is a clean, focused improvement that reduces noise without sacrificing functionality.

Issues by Severity

🚫 Blocker Issues

None - PR is safe to merge as-is.

🔴 Critical Issues

None - No critical security or functional issues.

🟡 Major Issues

None - The changes are minimal and well-scoped.

🔵 Minor Issues

1. Missing Workflow Permissions Declaration

  • Location: .github/workflows/e2e.yml
  • Issue: The workflow uses pull_request_target but doesn't declare explicit permissions like other workflows
  • Recommendation: Add explicit permissions (contents: read, pull-requests: read) after the on: block
  • Severity: Minor - workflow still functions correctly, but explicit permissions are a security best practice

Positive Highlights

✅ Well-Executed Changes

  1. Focused Scope - Only removes comment functionality
  2. Security Model Intact - safe-to-test label requirement preserved
  3. Clean Diff - Removes 29 lines, adds 0
  4. No Regressions - All test execution logic unchanged
  5. Follows PR Description - Changes match stated intent

✅ Security Best Practices

  • Uses pull_request_target with label gate for secure fork PR testing
  • Checks label in dedicated job before running tests
  • Uses github.event.pull_request.head.sha for PR code checkout

✅ Alignment with Repository Standards

  • Follows CLAUDE.md Testing Strategy
  • Adheres to CI/CD Patterns
  • Maintains change detection for component builds
  • Preserves artifact upload on failure

Final Assessment

Verdict: ✅ Approve with minor recommendation

Reasoning:

  • Changes are minimal, focused, and safe
  • Security model unchanged
  • No functional regressions
  • Improves signal-to-noise ratio in PRs

Merge Readiness: Ready to merge as-is. Adding explicit permissions is a nice-to-have, not a blocker.


Review completed with memory context:

  • CLAUDE.md, Backend/Frontend patterns, Security standards, K8s/Error handling/React Query patterns, E2E testing docs

Reviewed by: Claude Code (Sonnet 4.5) | 2026-02-03


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

@Gkrumbach07 Gkrumbach07 merged commit e050c32 into ambient-code:main Feb 3, 2026
5 of 6 checks passed
@Gkrumbach07
Copy link
Collaborator Author

🤖 Added Claude + Playwright MCP Live Testing!

I've added a second workflow for AI-powered exploratory testing:

Two Testing Workflows

1. Regular E2E Tests (safe-to-test label)

  • Cypress scripted tests
  • Fast (~15 seconds)
  • Validates known user flows
  • All 12 tests including agent interaction

2. Claude Live Testing (claude-test label) ✨ NEW

  • AI-powered exploratory testing
  • Claude navigates and tests intelligently
  • Generates test reports with screenshots
  • Uploads Playwright traces with video
  • Tests edge cases and unexpected scenarios

How to Use

# Regular E2E tests
gh pr edit <PR> --add-label safe-to-test

# Claude exploratory testing
gh pr edit <PR> --add-label claude-test

What Claude Will Do

  1. 📖 Read the PR changes
  2. 🌐 Navigate to deployed application
  3. 🧪 Test the feature intelligently
  4. 🔍 Explore edge cases
  5. 📹 Record everything with Playwright
  6. 📝 Generate natural language test report

Powered By

  • Playwright MCP - Browser automation via Model Context Protocol
  • Claude Sonnet 4 - Intelligent test execution
  • Playwright Traces - Video recordings with timeline

Note

The current implementation shows the architecture. Full MCP integration requires implementing the protocol client. This provides:

  • ✅ Infrastructure setup
  • ✅ Playwright MCP server
  • ✅ Workflow triggers
  • 🔄 MCP protocol client (needs implementation)

Perfect for combining with our existing Claude code review workflow! 🎯

@Gkrumbach07
Copy link
Collaborator Author

🚀 Updated: Added Claude + Playwright MCP Testing!

The PR now includes TWO testing workflows:

1️⃣ Regular E2E Tests (safe-to-test label)

  • ✅ Cypress scripted tests
  • ✅ Fast (~15 seconds)
  • ✅ 12 tests including agent interaction

2️⃣ Claude Live Testing (claude-test label) ✨ NEW

  • 🤖 AI-powered exploratory testing
  • 🌐 Claude navigates and tests intelligently using Playwright MCP
  • 📹 Video recordings via Playwright traces
  • 📝 Natural language test reports
  • 🔍 Explores edge cases

How Claude Tests

Claude reads PR → Navigates to deployed app → 
Tests features with browser tools → Reports findings + video

Uses Claude Code Action's native MCP support:
https://code.claude.com/docs/en/github-actions#pass-cli-arguments

claude_args: --mcp-config playwright-config.json

Usage

# Fast scripted tests
gh pr edit <PR> --add-label safe-to-test

# AI exploratory testing with video
gh pr edit <PR> --add-label claude-test

Perfect complement to our existing Claude code review! 🎯

@Gkrumbach07
Copy link
Collaborator Author

📦 Split into Separate PRs

I've split the features into two PRs:

This PR (#576): E2E workflow changes (safe-to-test label)
New PR (#578): Claude + Playwright MCP live testing (claude-test label)

Cleaner to review and merge separately!

The claude-test label has been added to #578, so the workflow is running now to test itself (meta! 🤯)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe-to-test Maintainer-approved: Run E2E tests with secrets

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant