[WIP] Add daily agentic workflow for pull request analysis #1885

Merged
pelikhan merged 4 commits into main from copilot/add-daily-workflow-analysis on Oct 17, 2025

Copilot AI commented Oct 17, 2025

✅ Complete Implementation

Successfully implemented a comprehensive daily workflow to analyze copilot-swe-agent pull requests.

All Requirements Met

  • Create new daily workflow .github/workflows/copilot-agent-analysis.md
  • Configure workflow with Claude engine and appropriate tools
  • Add GitHub tools for PR search and analysis (search_pull_requests, pull_request_read, list_pull_requests)
  • Configure cache-memory for historical trending data
  • Add safe-outputs for creating issues with analysis results
  • Implement PR analysis logic:
    • Search for PRs created by copilot-swe-agent in last 24h
    • For each PR: determine outcome (merged vs closed), comment count, timing data
    • Generate summary table with metrics
  • Implement historical trending:
    • Store daily metrics in cache memory
    • Analyze trends over time (success rate, avg time, etc.)
    • Compare with historical data (7-day and 30-day trends)
  • Check for instruction/prompt file changes:
    • Monitor changes to .github/ directory
    • Correlate with performance changes
  • Test workflow compilation with gh-aw compile
  • Validate with Go tests and linting
  • Address code review feedback
  • Run CodeQL security scan (0 issues found)
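
The per-PR analysis listed above (outcome, human comment count, timing) can be sketched in a few lines of Python. This is only an illustration, not the workflow's actual logic, which lives in the natural-language prompt: the `PullRequest` class, its field names, and the bot-detection rule (logins ending in `[bot]`, plus the literal login `Copilot`) are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PullRequest:
    """Hypothetical, simplified view of the PR fields the analysis needs."""
    number: int
    state: str                       # "open" or "closed"
    created_at: datetime
    closed_at: Optional[datetime]
    merged_at: Optional[datetime]
    comment_authors: list[str]       # login of each comment's author


def outcome(pr: PullRequest) -> str:
    """Classify a PR as merged, closed without merge, or still open."""
    if pr.merged_at is not None:
        return "merged"
    return "closed" if pr.state == "closed" else "open"


def human_comment_count(pr: PullRequest) -> int:
    """Count comments whose author does not look like a bot.

    Assumption: bot logins end in "[bot]" or are exactly "Copilot".
    """
    def is_bot(login: str) -> bool:
        return login.endswith("[bot]") or login == "Copilot"

    return sum(1 for login in pr.comment_authors if not is_bot(login))


def total_duration(pr: PullRequest) -> Optional[timedelta]:
    """Time from creation to merge or close; None while the PR is open."""
    end = pr.merged_at or pr.closed_at
    return None if end is None else end - pr.created_at
```

A merged PR always reports `merged` even though its state is also `closed`, which is why `merged_at` is checked first.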

Implementation Summary

Workflow Details:

  • File: .github/workflows/copilot-agent-analysis.md (451 lines)
  • Compiled: .github/workflows/copilot-agent-analysis.lock.yml (3903 lines, 200KB)
  • Schedule: Daily at 10:00 AM UTC via cron
  • Engine: Claude with Claude Code
  • Timeout: 15 minutes
  • Status: ✅ Compiles successfully, ✅ Tests pass, ✅ Linting passes, ✅ Security scan passes

Core Features:

  1. PR Analysis (Phase 1-2)

    • Searches for copilot-swe-agent PRs from last 24h
    • Handles multiple possible bot usernames
    • Determines outcome: merged ✅, closed ❌, or open ⏳
    • Counts human comments (filters out bots)
    • Tracks timing: agent duration + total PR lifetime
    • Extracts files changed and commit count
  2. Summary Table (Phase 3)

    • Formatted markdown table with all metrics
    • Visual indicators for status
    • Summary statistics (success rate, averages)
  3. Historical Trending (Phase 4)

    • Uses cache-memory at /tmp/gh-aw/cache-memory/copilot-agent-metrics/
    • Stores daily metrics in history.json
    • 7-day trend comparison with indicators
    • 30-day moving averages (when available)
    • Trend indicators: 📈 improving, 📉 declining, ➡️ stable
  4. Instruction Correlation (Phase 5)

    • Monitors .github/ for instruction file changes
    • Checks git log for changes in last 7 days
    • Correlates with performance changes
    • Provides before/after comparison
  5. Automated Reporting (Phase 6)

    • Creates comprehensive GitHub issues
    • Includes all analysis and recommendations
    • Auto-labels: automation, metrics, copilot-swe-agent
    • Uses safe-outputs for secure issue creation
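
As a rough illustration of the instruction-correlation phase in the list above, the sketch below reads recent `.github/` changes from `git log` and compares the mean success rate before and after a change date. The function names, the `@`-prefixed date format used to separate commits, and the shape of the `rates` mapping (ISO date to success rate) are assumptions for the example; the workflow itself only states that it checks the git log for the last 7 days.

```python
import subprocess
from statistics import mean
from typing import Optional


def parse_log(text: str) -> list[tuple[str, str]]:
    """Turn the git log output produced below into (commit date, path) pairs."""
    changes: list[tuple[str, str]] = []
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("@"):       # "@YYYY-MM-DD" marks a new commit
            current = line[1:]
        elif current is not None:      # any other line is a changed path
            changes.append((current, line))
    return changes


def instruction_changes(repo_dir: str = ".", days: int = 7) -> list[tuple[str, str]]:
    """List files under .github/ touched by commits in the last `days` days."""
    result = subprocess.run(
        ["git", "log", f"--since={days} days ago", "--name-only",
         "--pretty=format:@%ad", "--date=short", "--", ".github/"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)


def before_after(rates: dict[str, float], change_date: str):
    """Mean success rate strictly before vs. on or after an ISO change date."""
    before = [r for d, r in rates.items() if d < change_date]
    after = [r for d, r in rates.items() if d >= change_date]
    return (mean(before) if before else None,
            mean(after) if after else None)
```

ISO dates compare correctly as plain strings, so no date parsing is needed for the split. A difference between the two means is only a hint of correlation, especially with a handful of data points.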

Tools & Permissions:

  • GitHub API tools: search_pull_requests, pull_request_read, list_pull_requests, get_file_contents, list_commits, get_commit
  • Bash commands: safe file operations with error handling
  • Network: defaults + github domains
  • Permissions: contents:read, pull-requests:read, actions:read (minimal)
  • Safe-outputs handles issues:write separately

Safety & Security:

  • Bash commands handle missing directories gracefully
  • Error handling for all file operations
  • No security vulnerabilities (CodeQL verified)
  • Follows principle of least privilege
  • Safe-outputs pattern for write operations
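
The workflow does its file handling in Bash, but the same "missing directory is not an error" idea can be shown in Python for the history file. This is a sketch under assumptions: the entry schema (a list of dicts keyed by an ISO `date`) is hypothetical, since the actual layout of `history.json` is not described here.

```python
import json
from pathlib import Path


def load_history(path: Path) -> list[dict]:
    """Return stored daily entries, or [] if the file is missing or unreadable."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return []                      # first run, or a corrupted cache
    return data if isinstance(data, list) else []


def append_daily(path: Path, entry: dict) -> list[dict]:
    """Insert or replace the entry for entry["date"] and write the file back."""
    history = [e for e in load_history(path) if e.get("date") != entry["date"]]
    history.append(entry)
    history.sort(key=lambda e: e["date"])           # ISO dates sort as strings
    path.parent.mkdir(parents=True, exist_ok=True)  # create the cache dir if absent
    path.write_text(json.dumps(history, indent=2), encoding="utf-8")
    return history
```

In the workflow the path would be `history.json` under `/tmp/gh-aw/cache-memory/copilot-agent-metrics/`. Replacing an existing entry for the same date keeps a manual re-run from double-counting a day.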

Testing & Validation:

  • ✅ Workflow compiles successfully
  • ✅ All Go unit tests pass
  • ✅ Linting passes (Go and JavaScript)
  • ✅ CodeQL security scan: 0 issues
  • ✅ Code review feedback addressed

Documentation:

  • Comprehensive inline documentation in workflow
  • 6 well-defined phases with clear instructions
  • Edge case handling documented
  • Success criteria defined
  • Troubleshooting guidelines included

Example Output

The workflow will generate daily issues like:

# 🤖 Copilot Agent PR Analysis - 2024-10-17

## Summary
**Analysis Period**: Last 24 hours
**Total PRs Analyzed**: 5
**Success Rate**: 80%

## PR Summary Table
| PR # | Title | Outcome | Comments | Agent Duration | Total Duration | Files | Status |
|------|-------|---------|----------|----------------|----------------|-------|--------|
| #123 | Fix bug | Merged | 3 | 12m | 1h 45m | 4 | ✅ |
| #124 | Add feature | Merged | 5 | 18m | 3h 20m | 8 | ✅ |
...

## Historical Comparison (7-day trend)
- **Success Rate**: 80% vs 75% (7-day avg) 📈 +5%
- **Agent Duration**: 15m vs 20m (7-day avg) 📈 25% faster
- **Human Engagement**: 4.0 comments vs 3.5 (7-day avg) 📈 +14%

## Instruction File Changes
[Correlation analysis if changes detected]

## Recommendations
- Success rate is improving - current instruction set appears effective
- Agent is completing work faster than historical average
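
The trend figures in the example above follow from simple relative-change arithmetic, sketched here. The `trend` helper and its 2% "stable" band are assumptions for illustration and are not taken from the workflow. Note that the example's success-rate delta ("+5%") reads as a percentage-point difference (80 − 75), whereas the duration and comment figures are relative changes.

```python
IMPROVING, DECLINING, STABLE = "📈", "📉", "➡️"


def trend(current: float, baseline: float,
          higher_is_better: bool = True, tolerance: float = 0.02):
    """Return (relative change vs. baseline, trend indicator).

    Assumption: changes within +/- `tolerance` count as stable.
    """
    if baseline == 0:
        return 0.0, STABLE             # no meaningful baseline yet
    change = (current - baseline) / baseline
    improvement = change if higher_is_better else -change
    if improvement > tolerance:
        return change, IMPROVING
    if improvement < -tolerance:
        return change, DECLINING
    return change, STABLE


# Agent duration: 15m today vs. a 20m 7-day average; lower is better.
change, indicator = trend(15, 20, higher_is_better=False)
print(f"{indicator} {abs(change):.0%} faster")

# Human comments: 4.0 today vs. a 3.5 7-day average.
change, indicator = trend(4.0, 3.5)
print(f"{indicator} {change:+.0%}")
```

For the duration metric the relative change is (15 − 20) / 20 = −25%, which is reported as an improvement because lower is better; for comments it is (4.0 − 3.5) / 3.5 ≈ +14%.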

Files Changed

  • Added: .github/workflows/copilot-agent-analysis.md (13KB)
  • Added: .github/workflows/copilot-agent-analysis.lock.yml (200KB)

Next Steps

This PR is ready for review. Once merged, the workflow will:

  1. Run automatically every day at 10:00 AM UTC
  2. Analyze all copilot-swe-agent PRs from the previous 24 hours
  3. Build historical data in cache-memory
  4. Create daily issues with comprehensive analysis and trends
  5. Provide actionable insights into agent performance

The workflow can also be manually triggered via workflow_dispatch for testing or ad-hoc analysis.
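
For ad-hoc checks, the "previous 24 hours" window can be expressed as a GitHub search query. The sketch below only builds the query string; the helper name, the `octo/repo` placeholder, and the `app/copilot-swe-agent` author value are assumptions (the description mentions several possible bot usernames without listing them).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def search_query(repo: str, author: str,
                 now: Optional[datetime] = None, hours: int = 24) -> str:
    """Build a GitHub search query for PRs an author created in the last `hours`."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    since = (now - timedelta(hours=hours)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"repo:{repo} is:pr author:{author} created:>={since}"


# Hypothetical repository and author, evaluated at the scheduled 10:00 UTC run.
run_time = datetime(2025, 10, 17, 10, 0, tzinfo=timezone.utc)
print(search_query("octo/repo", "app/copilot-swe-agent", now=run_time))
```

To cover multiple candidate usernames, one query per username can be issued and the results merged by PR number.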

Original prompt

Add a daily agentic workflow that analyzes the pull requests created by copilot agent (copilot-swe-agent) in the last 24h.

For each PR, determine outcome (merged vs closed), number of human comments, time taken by the agent to start/finish. Generate summary table.

Add caching for historical trending analysis. Based on historical data provide trending analysis on performance of the agent.

Check for changes in instructions/prompt file to correlate with performance changes.

Use Claude.


Copilot AI and others added 2 commits October 17, 2025 20:43
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
- Creates new workflow that runs daily at 10am UTC
- Analyzes PRs created by copilot-swe-agent in last 24h
- Tracks metrics: outcome, comments, timing, files changed
- Implements cache-memory for historical data storage
- Provides trending analysis comparing current vs historical performance
- Monitors instruction file changes and correlates with performance
- Uses Claude engine with GitHub API tools
- Creates issues with comprehensive analysis and recommendations

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@pelikhan marked this pull request as ready for review October 17, 2025 20:49
@github-actions

Agentic Changeset Generator triggered by this pull request.

- Fix bash commands to handle missing directories safely
- Improve pull_request_read usage instructions
- Add error handling for find commands
- Update bash tool permissions to be safer

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@pelikhan merged commit 28f78d4 into main Oct 17, 2025
4 checks passed
@pelikhan deleted the copilot/add-daily-workflow-analysis branch October 17, 2025 20:58
Copilot AI requested a review from pelikhan October 17, 2025 20:58