fix(e2e): simplify E2E workflow to trigger only on safe-to-test label #576

Gkrumbach07 · 2026-02-03T21:20:48Z

Problem

E2E tests fail on fork PRs because they don't have access to GitHub secrets (ANTHROPIC_API_KEY).

Solution

Changed E2E workflow to ONLY run when safe-to-test label is added to a PR.

Security model:

Uses pull_request_target for secret access
Requires explicit maintainer approval via label
Runs full test suite with ANTHROPIC_API_KEY

Usage

# Maintainer reviews PR code
# Then adds label to trigger tests
gh pr edit <PR_NUMBER> --add-label safe-to-test

Benefits

🔒 Secure: Maintainer review required before running with secrets
🎯 Simple: One workflow, one trigger
💯 Complete: Full agent interaction testing
✅ Clean: Results show in Actions UI, no PR comment spam

Changes the E2E workflow to ONLY run when 'safe-to-test' label is added to a PR. Security model: - Uses pull_request_target for secret access - Requires explicit maintainer approval via label - Runs full test suite with ANTHROPIC_API_KEY Usage: gh pr edit <PR_NUMBER> --add-label safe-to-test Benefits: - Secure: maintainer review required before running with secrets - Simple: one workflow, one trigger - Complete: full agent interaction testing

Gkrumbach07 · 2026-02-03T21:21:14Z

✅ Clean PR created from upstream/main

This PR contains just the E2E workflow changes:

Trigger only on safe-to-test label
Uses pull_request_target for secret access
No extra commits, no merge conflicts

The safe-to-test label is already added, so tests should be running now!

Watch here: https://github.com/ambient-code/platform/actions/workflows/e2e.yml

github-actions · 2026-02-03T21:24:17Z

Claude Code Review

Summary

PR #576 simplifies the E2E workflow by removing PR comment notifications while maintaining the secure label-based trigger mechanism. This is a clean, focused improvement that reduces noise without sacrificing functionality.

Issues by Severity

🚫 Blocker Issues

None - PR is safe to merge as-is.

🔴 Critical Issues

None - No critical security or functional issues.

🟡 Major Issues

None - The changes are minimal and well-scoped.

🔵 Minor Issues

1. Missing Workflow Permissions Declaration

Location: .github/workflows/e2e.yml
Issue: The workflow uses pull_request_target but doesn't declare explicit permissions like other workflows
Recommendation: Add explicit permissions (contents: read, pull-requests: read) after the on: block
Severity: Minor - workflow still functions correctly, but explicit permissions are a security best practice

Positive Highlights

✅ Well-Executed Changes

Focused Scope - Only removes comment functionality
Security Model Intact - safe-to-test label requirement preserved
Clean Diff - Removes 29 lines, adds 0
No Regressions - All test execution logic unchanged
Follows PR Description - Changes match stated intent

✅ Security Best Practices

Uses pull_request_target with label gate for secure fork PR testing
Checks label in dedicated job before running tests
Uses github.event.pull_request.head.sha for PR code checkout

✅ Alignment with Repository Standards

Follows CLAUDE.md Testing Strategy
Adheres to CI/CD Patterns
Maintains change detection for component builds
Preserves artifact upload on failure

Final Assessment

Verdict: ✅ Approve with minor recommendation

Reasoning:

Changes are minimal, focused, and safe
Security model unchanged
No functional regressions
Improves signal-to-noise ratio in PRs

Merge Readiness: Ready to merge as-is. Adding explicit permissions is a nice-to-have, not a blocker.

Review completed with memory context:

CLAUDE.md, Backend/Frontend patterns, Security standards, K8s/Error handling/React Query patterns, E2E testing docs

Reviewed by: Claude Code (Sonnet 4.5) | 2026-02-03

🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

CLAUDE.md - Master project instructions, development standards
backend-development.md - Go backend, K8s integration patterns
frontend-development.md - NextJS, Shadcn UI, React Query patterns
security-standards.md - Auth, RBAC, token handling
k8s-client-usage.md - User token vs service account patterns
error-handling.md - Consistent error patterns
react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

Gkrumbach07 · 2026-02-03T23:18:16Z

🤖 Added Claude + Playwright MCP Live Testing!

I've added a second workflow for AI-powered exploratory testing:

Two Testing Workflows

1. Regular E2E Tests (safe-to-test label)

Cypress scripted tests
Fast (~15 seconds)
Validates known user flows
All 12 tests including agent interaction

2. Claude Live Testing (claude-test label) ✨ NEW

AI-powered exploratory testing
Claude navigates and tests intelligently
Generates test reports with screenshots
Uploads Playwright traces with video
Tests edge cases and unexpected scenarios

How to Use

# Regular E2E tests
gh pr edit <PR> --add-label safe-to-test

# Claude exploratory testing
gh pr edit <PR> --add-label claude-test

What Claude Will Do

📖 Read the PR changes
🌐 Navigate to deployed application
🧪 Test the feature intelligently
🔍 Explore edge cases
📹 Record everything with Playwright
📝 Generate natural language test report

Powered By

Playwright MCP - Browser automation via Model Context Protocol
Claude Sonnet 4 - Intelligent test execution
Playwright Traces - Video recordings with timeline

Note

The current implementation shows the architecture. Full MCP integration requires implementing the protocol client. This provides:

✅ Infrastructure setup
✅ Playwright MCP server
✅ Workflow triggers
🔄 MCP protocol client (needs implementation)

Perfect for combining with our existing Claude code review workflow! 🎯

Gkrumbach07 · 2026-02-03T23:23:11Z

🚀 Updated: Added Claude + Playwright MCP Testing!

The PR now includes TWO testing workflows:

1️⃣ Regular E2E Tests (`safe-to-test` label)

✅ Cypress scripted tests
✅ Fast (~15 seconds)
✅ 12 tests including agent interaction

2️⃣ Claude Live Testing (`claude-test` label) ✨ NEW

🤖 AI-powered exploratory testing
🌐 Claude navigates and tests intelligently using Playwright MCP
📹 Video recordings via Playwright traces
📝 Natural language test reports
🔍 Explores edge cases

How Claude Tests

Claude reads PR → Navigates to deployed app → 
Tests features with browser tools → Reports findings + video

Uses Claude Code Action's native MCP support:
https://code.claude.com/docs/en/github-actions#pass-cli-arguments

claude_args: --mcp-config playwright-config.json

Usage

# Fast scripted tests
gh pr edit <PR> --add-label safe-to-test

# AI exploratory testing with video
gh pr edit <PR> --add-label claude-test

Perfect complement to our existing Claude code review! 🎯

Gkrumbach07 · 2026-02-03T23:43:32Z

📦 Split into Separate PRs

I've split the features into two PRs:

This PR (#576): E2E workflow changes (safe-to-test label)
New PR (#578): Claude + Playwright MCP live testing (claude-test label)

Cleaner to review and merge separately!

The claude-test label has been added to #578, so the workflow is running now to test itself (meta! 🤯)

Gkrumbach07 added the safe-to-test Maintainer-approved: Run E2E tests with secrets label Feb 3, 2026

Gkrumbach07 merged commit e050c32 into ambient-code:main Feb 3, 2026
5 of 6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(e2e): simplify E2E workflow to trigger only on safe-to-test label #576

fix(e2e): simplify E2E workflow to trigger only on safe-to-test label #576

Uh oh!

Gkrumbach07 commented Feb 3, 2026

Uh oh!

Gkrumbach07 commented Feb 3, 2026

Uh oh!

github-actions bot commented Feb 3, 2026 •

edited

Loading

What Amber Loaded for Code Review

Uh oh!

Uh oh!

Gkrumbach07 commented Feb 3, 2026

Uh oh!

Gkrumbach07 commented Feb 3, 2026

Uh oh!

Gkrumbach07 commented Feb 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

fix(e2e): simplify E2E workflow to trigger only on safe-to-test label #576

fix(e2e): simplify E2E workflow to trigger only on safe-to-test label #576

Uh oh!

Conversation

Gkrumbach07 commented Feb 3, 2026

Problem

Solution

Usage

Benefits

Uh oh!

Gkrumbach07 commented Feb 3, 2026

Uh oh!

github-actions bot commented Feb 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Claude Code Review

Summary

Issues by Severity

🚫 Blocker Issues

🔴 Critical Issues

🟡 Major Issues

🔵 Minor Issues

Positive Highlights

✅ Well-Executed Changes

✅ Security Best Practices

✅ Alignment with Repository Standards

Final Assessment

What Amber Loaded for Code Review

Uh oh!

Uh oh!

Gkrumbach07 commented Feb 3, 2026

🤖 Added Claude + Playwright MCP Live Testing!

Two Testing Workflows

How to Use

What Claude Will Do

Powered By

Note

Uh oh!

Gkrumbach07 commented Feb 3, 2026

🚀 Updated: Added Claude + Playwright MCP Testing!

1️⃣ Regular E2E Tests (safe-to-test label)

2️⃣ Claude Live Testing (claude-test label) ✨ NEW

How Claude Tests

Usage

Uh oh!

Gkrumbach07 commented Feb 3, 2026

📦 Split into Separate PRs

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions bot commented Feb 3, 2026 •

edited

Loading

1️⃣ Regular E2E Tests (`safe-to-test` label)

2️⃣ Claude Live Testing (`claude-test` label) ✨ NEW