From 61a7c8681801ea7569931e2ad04fa8fa2e9b06cb Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 20 Mar 2026 21:50:01 +0000 Subject: [PATCH 1/7] Initial plan From 487b7af975ba33688663b876a252e061f1ae2347 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 20 Mar 2026 21:56:55 +0000 Subject: [PATCH 2/7] feat: add data-science.md prompt and link in agentic-workflows dispatcher Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/41a439a6-1ec6-4873-a05a-f208b0b23e6f --- .github/agents/agentic-workflows.agent.md | 11 ++ .github/aw/data-science.md | 210 ++++++++++++++++++++++ 2 files changed, 221 insertions(+) create mode 100644 .github/aw/data-science.md diff --git a/.github/agents/agentic-workflows.agent.md b/.github/agents/agentic-workflows.agent.md index c0f21877e1b..ad20d4bc945 100644 --- a/.github/agents/agentic-workflows.agent.md +++ b/.github/agents/agentic-workflows.agent.md @@ -129,6 +129,17 @@ When you interact with this agent, it will: - "Analyze coverage trends over time" - "Add a coverage gate that blocks PRs below a threshold" +### Generate Charts and Trend Visualizations +**Load when**: The workflow generates Python-based charts, trend visualizations, data dashboards, or any time-series analysis โ€” whether triggered on a schedule, by a slash command, or on demand. 
+ +**Prompt file**: https://github.com/github/gh-aw/blob/main/.github/aw/data-science.md + +**Use cases**: +- "Create a weekly chart of GitHub activity trends" +- "Visualize test coverage over time with moving averages" +- "Generate a data dashboard for workflow run durations" +- "Plot contributor growth month-over-month" + ## Instructions When a user interacts with you: diff --git a/.github/aw/data-science.md b/.github/aw/data-science.md new file mode 100644 index 00000000000..580b2fc06ec --- /dev/null +++ b/.github/aw/data-science.md @@ -0,0 +1,210 @@ +--- +description: Guidelines for creating agentic workflows that generate charts and trend visualizations using Python scientific computing libraries with persistent historical data. +--- + +# Data Science & Chart Generation + +Consult this file when creating an agentic workflow that generates charts, visualizations, or trend analysis โ€” including data dashboards, metric reports, time-series plots, or any Python-based visualization output. + +## Choosing the Right Shared Import + +| Goal | Import | +|---|---| +| Generate charts + persistent trend tracking | `shared/charts-with-trending.md` | +| Quick trending setup, no nested imports | `shared/trending-charts-simple.md` | +| Python environment only, no cache-memory | `shared/python-dataviz.md` | + +Use `shared/charts-with-trending.md` by default when the workflow needs to track metrics across runs. Use `shared/trending-charts-simple.md` when strict-mode compatibility or a minimal configuration is preferred. 
+ +## Minimal Frontmatter + +```yaml +--- +description: [what the workflow visualizes] +on: + schedule: + - cron: "0 9 * * 1" # example: every Monday at 09:00 UTC + workflow_dispatch: +permissions: + contents: read + actions: read # add issue/discussion scopes when needed +engine: copilot +imports: + - shared/charts-with-trending.md +safe-outputs: + upload-asset: + create-issue: # or create-discussion for gallery-style reports + title-prefix: "๐Ÿ“Š [Report Name]:" + labels: [report] + close-older-issues: true + expires: 30 +--- +``` + +## Environment Reference + +The import sets up everything automatically: + +| Location | Purpose | +|---|---| +| `/tmp/gh-aw/python/` | Working directory for scripts | +| `/tmp/gh-aw/python/data/` | Input data files (CSV, JSON) | +| `/tmp/gh-aw/python/charts/` | Generated chart images (PNG) | +| `/tmp/gh-aw/cache-memory/trending/` | Persistent time-series history | + +**Libraries available**: NumPy, Pandas, Matplotlib, Seaborn, SciPy + +## Writing the Agent Prompt + +A well-structured prompt for a data-visualization workflow has these phases: + +### Phase 1 โ€“ Load historical data +```markdown +1. Check `/tmp/gh-aw/cache-memory/trending//history.jsonl` for existing data. +2. Load it into a Pandas DataFrame if it exists; otherwise start fresh. +``` + +### Phase 2 โ€“ Collect or generate data +```markdown +1. Collect today's metrics from the GitHub API (or generate sample data with NumPy). +2. Save raw data to `/tmp/gh-aw/python/data/.csv` or `.json` โ€” never inline data in Python code. +``` + +### Phase 3 โ€“ Append to history (JSON Lines) +```markdown +Append a new record to history.jsonl with ISO 8601 timestamp, metric name, value, and metadata. +Implement a 90-day retention policy to prevent unbounded growth. +``` + +### Phase 4 โ€“ Generate charts +```markdown +1. Create trend charts if โ‰ฅ 2 historical data points exist: + - Time-series line chart with 7-day moving average + - Comparative trend chart for multiple metrics +2. 
Fall back to bar/distribution charts when history is empty. +3. Save all charts to `/tmp/gh-aw/python/charts/` at DPI 300, seaborn style. +``` + +### Phase 5 โ€“ Upload and report +```markdown +1. Upload each chart using the `upload asset` tool. +2. Create an issue (or discussion) embedding the uploaded chart URLs in markdown. +3. If no meaningful data was found, call `noop` with a brief explanation. +``` + +## Data Rules + +**CRITICAL**: Data must never be inlined in Python code. Always write data to a file first, then load it with pandas: + +```python +# โŒ PROHIBITED +data = [10, 20, 30, 40, 50] + +# โœ… REQUIRED +import pandas as pd +data = pd.read_csv('/tmp/gh-aw/python/data/metrics.csv') +``` + +## Trending Patterns + +### Append a daily data point + +```python +import json +from datetime import datetime + +point = {"timestamp": datetime.now().isoformat(), "value": 42, "metric": "issue_count"} +with open('/tmp/gh-aw/cache-memory/trending/issues/history.jsonl', 'a') as f: + f.write(json.dumps(point) + '\n') +``` + +### Load history and compute a 7-day moving average + +```python +import pandas as pd + +df = pd.read_json('/tmp/gh-aw/cache-memory/trending/issues/history.jsonl', lines=True) +df['date'] = pd.to_datetime(df['timestamp']).dt.date +df = df.sort_values('timestamp') +df['rolling_avg'] = df['value'].rolling(window=7, min_periods=1).mean() +``` + +### Enforce 90-day retention + +```python +from datetime import timedelta + +cutoff = pd.Timestamp.now() - timedelta(days=90) +df = df[pd.to_datetime(df['timestamp']) >= cutoff] +df.to_json('/tmp/gh-aw/cache-memory/trending/issues/history.jsonl', + orient='records', lines=True) +``` + +## Chart Quality Settings + +```python +import matplotlib.pyplot as plt +import seaborn as sns + +sns.set_style("whitegrid") +sns.set_palette("husl") + +fig, ax = plt.subplots(figsize=(12, 7), dpi=300) +# ... plotting code ... 
+plt.tight_layout() +plt.savefig('/tmp/gh-aw/python/charts/trend.png', + dpi=300, bbox_inches='tight', facecolor='white') +``` + +**Standards**: 300 DPI minimum ยท 12ร—7 inch figure ยท clear axis labels and title ยท legend for multi-series ยท grid lines enabled + +## Report Structure + +When creating the issue or discussion, use this template: + +```markdown +# ๐Ÿ“Š [Report Title] โ€” [Date] + +## Summary +[2โ€“3 sentences describing trends and key findings] + +## [Metric 1] Trend +![Metric 1 trend chart](URL_FROM_UPLOAD_ASSET) +[Brief analysis: direction, moving average, notable events] + +## [Metric 2] Distribution +![Metric 2 chart](URL_FROM_UPLOAD_ASSET) +[Brief analysis] + +## Data Details +- **Source**: [GitHub API / generated sample / external] +- **Data points**: [count] +- **Date range**: [start] to [end] +- **Tracking period**: [N] days + +## Cache Status +- **Metrics tracked**: [list] +- **History location**: `/tmp/gh-aw/cache-memory/trending/` +``` + +Use `###` and lower for all headers inside the report body โ€” `#` and `##` are reserved for issue/discussion titles. 
+ +## Common Use Cases + +| Intent | Notes | +|---|---| +| "Create a weekly GitHub activity chart" | Schedule weekly; track issues, PRs, commits | +| "Visualize test coverage trends over time" | Trigger on push/PR; append per-run metrics | +| "Generate a dashboard of workflow run durations" | Schedule daily; use GitHub Actions API | +| "Plot stale repo aging distribution" | On-demand; no trending needed, skip cache | +| "Track contributor growth month-over-month" | Schedule monthly; long retention (365 days) | + +## Key Reminders + +- โœ… **Import the right shared workflow** โ€” `charts-with-trending.md` or `trending-charts-simple.md` +- โœ… **Check cache first** โ€” load historical data before collecting new data +- โœ… **Append, never overwrite** โ€” use JSON Lines for time-series history +- โœ… **External data files only** โ€” never inline data in Python +- โœ… **Upload charts before reporting** โ€” collect all asset URLs, then create the issue/discussion +- โœ… **Call `noop` if nothing to report** โ€” required when no safe-output action is taken +- โœ… **Use DPI 300** and seaborn styling for publication-quality charts From 780f8b10c487781ce7280bab7017af9705ef43c6 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 20 Mar 2026 22:08:37 +0000 Subject: [PATCH 3/7] fix: make data-science.md self-contained with shared/* imports Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/5a732eef-0ee7-4160-b82d-d14d7d8a64fd --- .github/agents/agentic-workflows.agent.md | 11 ----------- .github/aw/data-science.md | 4 ++++ 2 files changed, 4 insertions(+), 11 deletions(-) diff --git a/.github/agents/agentic-workflows.agent.md b/.github/agents/agentic-workflows.agent.md index ad20d4bc945..c0f21877e1b 100644 --- a/.github/agents/agentic-workflows.agent.md +++ b/.github/agents/agentic-workflows.agent.md @@ -129,17 +129,6 @@ When you interact with 
this agent, it will: - "Analyze coverage trends over time" - "Add a coverage gate that blocks PRs below a threshold" -### Generate Charts and Trend Visualizations -**Load when**: The workflow generates Python-based charts, trend visualizations, data dashboards, or any time-series analysis โ€” whether triggered on a schedule, by a slash command, or on demand. - -**Prompt file**: https://github.com/github/gh-aw/blob/main/.github/aw/data-science.md - -**Use cases**: -- "Create a weekly chart of GitHub activity trends" -- "Visualize test coverage over time with moving averages" -- "Generate a data dashboard for workflow run durations" -- "Plot contributor growth month-over-month" - ## Instructions When a user interacts with you: diff --git a/.github/aw/data-science.md b/.github/aw/data-science.md index 580b2fc06ec..0d4f94dca09 100644 --- a/.github/aw/data-science.md +++ b/.github/aw/data-science.md @@ -1,5 +1,9 @@ --- description: Guidelines for creating agentic workflows that generate charts and trend visualizations using Python scientific computing libraries with persistent historical data. 
+imports: + - shared/python-dataviz.md + - shared/charts-with-trending.md + - shared/trending-charts-simple.md --- # Data Science & Chart Generation From 35c50dfe090a2875a1d051a953108b2f2159b080 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 20 Mar 2026 22:17:38 +0000 Subject: [PATCH 4/7] =?UTF-8?q?fix:=20inline=20shared=20workflow=20content?= =?UTF-8?q?=20into=20data-science.md=20=E2=80=94=20no=20imports,=20fully?= =?UTF-8?q?=20self-contained?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/ef56ef3f-3d14-48e2-9091-d0d4985f238a --- .github/aw/data-science.md | 144 ++++++++++++++++++++++++++++--------- 1 file changed, 112 insertions(+), 32 deletions(-) diff --git a/.github/aw/data-science.md b/.github/aw/data-science.md index 0d4f94dca09..46ed81f5b96 100644 --- a/.github/aw/data-science.md +++ b/.github/aw/data-science.md @@ -1,26 +1,12 @@ --- description: Guidelines for creating agentic workflows that generate charts and trend visualizations using Python scientific computing libraries with persistent historical data. -imports: - - shared/python-dataviz.md - - shared/charts-with-trending.md - - shared/trending-charts-simple.md --- # Data Science & Chart Generation Consult this file when creating an agentic workflow that generates charts, visualizations, or trend analysis โ€” including data dashboards, metric reports, time-series plots, or any Python-based visualization output. 
-## Choosing the Right Shared Import - -| Goal | Import | -|---|---| -| Generate charts + persistent trend tracking | `shared/charts-with-trending.md` | -| Quick trending setup, no nested imports | `shared/trending-charts-simple.md` | -| Python environment only, no cache-memory | `shared/python-dataviz.md` | - -Use `shared/charts-with-trending.md` by default when the workflow needs to track metrics across runs. Use `shared/trending-charts-simple.md` when strict-mode compatibility or a minimal configuration is preferred. - -## Minimal Frontmatter +## Frontmatter Template ```yaml --- @@ -31,10 +17,17 @@ on: workflow_dispatch: permissions: contents: read - actions: read # add issue/discussion scopes when needed + actions: read # add issues/discussions scopes when needed engine: copilot -imports: - - shared/charts-with-trending.md +tools: + cache-memory: + key: trending-data-${{ github.workflow }}-${{ github.run_id }} + bash: + - "*" +network: + allowed: + - defaults + - python safe-outputs: upload-asset: create-issue: # or create-discussion for gallery-style reports @@ -42,21 +35,46 @@ safe-outputs: labels: [report] close-older-issues: true expires: 30 +steps: + - name: Setup Python environment + run: | + mkdir -p /tmp/gh-aw/python/{data,charts,artifacts} + mkdir -p /tmp/gh-aw/cache-memory/trending + pip install --user --quiet numpy pandas matplotlib seaborn scipy + - name: Upload charts + if: always() + uses: actions/upload-artifact@v7.0.0 + with: + name: data-charts + path: /tmp/gh-aw/python/charts/*.png + if-no-files-found: warn + retention-days: 30 + - name: Upload source files and data + if: always() + uses: actions/upload-artifact@v7.0.0 + with: + name: python-source-and-data + path: | + /tmp/gh-aw/python/*.py + /tmp/gh-aw/python/data/* + if-no-files-found: warn + retention-days: 30 --- ``` -## Environment Reference - -The import sets up everything automatically: +## Environment | Location | Purpose | |---|---| | `/tmp/gh-aw/python/` | Working directory for 
scripts | | `/tmp/gh-aw/python/data/` | Input data files (CSV, JSON) | | `/tmp/gh-aw/python/charts/` | Generated chart images (PNG) | -| `/tmp/gh-aw/cache-memory/trending/` | Persistent time-series history | +| `/tmp/gh-aw/python/artifacts/` | Additional output files | +| `/tmp/gh-aw/cache-memory/trending/` | Persistent time-series history across runs | + +**Libraries**: NumPy ยท Pandas ยท Matplotlib ยท Seaborn ยท SciPy -**Libraries available**: NumPy, Pandas, Matplotlib, Seaborn, SciPy +Charts and Python source files are automatically uploaded as GitHub Actions artifacts (`data-charts`, `python-source-and-data`, retention 30 days) so they are available even if the workflow fails. ## Writing the Agent Prompt @@ -98,7 +116,7 @@ Implement a 90-day retention policy to prevent unbounded growth. ## Data Rules -**CRITICAL**: Data must never be inlined in Python code. Always write data to a file first, then load it with pandas: +**CRITICAL**: Data must never be inlined in Python code. Always write data to an external file first, then load it with pandas: ```python # โŒ PROHIBITED @@ -111,26 +129,71 @@ data = pd.read_csv('/tmp/gh-aw/python/data/metrics.csv') ## Trending Patterns +Cache-memory at `/tmp/gh-aw/cache-memory/trending/` persists across runs. 
Organize it as: + +``` +/tmp/gh-aw/cache-memory/trending/ +โ”œโ”€โ”€ / +โ”‚ โ”œโ”€โ”€ history.jsonl # Time-series data (one JSON object per line) +โ”‚ โ”œโ”€โ”€ metadata.json # Data schema and description +โ”‚ โ””โ”€โ”€ last_updated.txt # Timestamp of last update +โ””โ”€โ”€ index.json # Index of all tracked metrics +``` + ### Append a daily data point ```python import json from datetime import datetime -point = {"timestamp": datetime.now().isoformat(), "value": 42, "metric": "issue_count"} +point = { + "timestamp": datetime.now().isoformat(), + "metric": "issue_count", + "value": 42, + "metadata": {"source": "github_api"} +} with open('/tmp/gh-aw/cache-memory/trending/issues/history.jsonl', 'a') as f: f.write(json.dumps(point) + '\n') ``` -### Load history and compute a 7-day moving average +### Load history into a DataFrame ```python import pandas as pd +import os + +history_file = '/tmp/gh-aw/cache-memory/trending/issues/history.jsonl' +if os.path.exists(history_file): + df = pd.read_json(history_file, lines=True) + df['timestamp'] = pd.to_datetime(df['timestamp']) + df = df.sort_values('timestamp') +else: + df = pd.DataFrame() # Start fresh if no history +``` + +### Compute a 7-day moving average -df = pd.read_json('/tmp/gh-aw/cache-memory/trending/issues/history.jsonl', lines=True) -df['date'] = pd.to_datetime(df['timestamp']).dt.date -df = df.sort_values('timestamp') +```python df['rolling_avg'] = df['value'].rolling(window=7, min_periods=1).mean() + +fig, ax = plt.subplots(figsize=(12, 7), dpi=300) +ax.plot(df['timestamp'], df['value'], label='Actual', alpha=0.5, marker='o') +ax.plot(df['timestamp'], df['rolling_avg'], label='7-day Average', linewidth=2.5) +ax.fill_between(df['timestamp'], df['value'], df['rolling_avg'], alpha=0.2) +ax.legend(loc='best') +``` + +### Compare multiple metrics over time + +```python +fig, ax = plt.subplots(figsize=(14, 8), dpi=300) +for metric in ['metric_a', 'metric_b', 'metric_c']: + metric_data = df[df['metric'] == metric] + 
ax.plot(metric_data['timestamp'], metric_data['value'], + marker='o', label=metric, linewidth=2) +ax.set_title('Comparative Metrics Trends', fontsize=16, fontweight='bold') +ax.legend(loc='best', fontsize=12) +ax.grid(True, alpha=0.3) ``` ### Enforce 90-day retention @@ -139,7 +202,7 @@ df['rolling_avg'] = df['value'].rolling(window=7, min_periods=1).mean() from datetime import timedelta cutoff = pd.Timestamp.now() - timedelta(days=90) -df = df[pd.to_datetime(df['timestamp']) >= cutoff] +df = df[df['timestamp'] >= cutoff] df.to_json('/tmp/gh-aw/cache-memory/trending/issues/history.jsonl', orient='records', lines=True) ``` @@ -155,12 +218,29 @@ sns.set_palette("husl") fig, ax = plt.subplots(figsize=(12, 7), dpi=300) # ... plotting code ... +ax.set_title('Title', fontsize=16, fontweight='bold') +ax.set_xlabel('Date', fontsize=12) +ax.set_ylabel('Value', fontsize=12) +ax.grid(True, alpha=0.3) +plt.xticks(rotation=45) plt.tight_layout() plt.savefig('/tmp/gh-aw/python/charts/trend.png', dpi=300, bbox_inches='tight', facecolor='white') ``` -**Standards**: 300 DPI minimum ยท 12ร—7 inch figure ยท clear axis labels and title ยท legend for multi-series ยท grid lines enabled +**Standards**: 300 DPI minimum ยท 12ร—7 inch figure ยท clear axis labels and title ยท legend for multi-series ยท grid lines enabled ยท colorblind-friendly palette (seaborn `husl` default) + +## Including Charts in Reports + +Assets uploaded with the `upload asset` tool are published to an orphaned git branch and become URL-addressable after the workflow completes. + +```markdown +## Visualization Results + +![Chart description](https://raw.githubusercontent.com/owner/repo/assets/workflow-name/trend.png) + +The chart above shows... 
+``` ## Report Structure @@ -205,10 +285,10 @@ Use `###` and lower for all headers inside the report body โ€” `#` and `##` are ## Key Reminders -- โœ… **Import the right shared workflow** โ€” `charts-with-trending.md` or `trending-charts-simple.md` - โœ… **Check cache first** โ€” load historical data before collecting new data - โœ… **Append, never overwrite** โ€” use JSON Lines for time-series history - โœ… **External data files only** โ€” never inline data in Python - โœ… **Upload charts before reporting** โ€” collect all asset URLs, then create the issue/discussion - โœ… **Call `noop` if nothing to report** โ€” required when no safe-output action is taken - โœ… **Use DPI 300** and seaborn styling for publication-quality charts +- โœ… **90-day retention** โ€” prune history to prevent unbounded growth From 56c7835bee37792a8181a60ed104758f5ac8dee6 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 20 Mar 2026 22:26:33 +0000 Subject: [PATCH 5/7] =?UTF-8?q?chore:=20optimize=20data-science.md=20promp?= =?UTF-8?q?t=20=E2=80=94=2050%=20shorter,=20no=20redundancy?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/5198ff9c-8642-46a2-ae6b-2d9c299ae4a3 --- .github/aw/data-science.md | 277 +++++++++---------------------------- 1 file changed, 65 insertions(+), 212 deletions(-) diff --git a/.github/aw/data-science.md b/.github/aw/data-science.md index 46ed81f5b96..a6ca029ce10 100644 --- a/.github/aw/data-science.md +++ b/.github/aw/data-science.md @@ -4,7 +4,7 @@ description: Guidelines for creating agentic workflows that generate charts and # Data Science & Chart Generation -Consult this file when creating an agentic workflow that generates charts, visualizations, or trend analysis โ€” including data dashboards, metric reports, time-series plots, or 
any Python-based visualization output. +Use when creating a workflow that generates charts, trend visualizations, dashboards, or any Python-based metric output. ## Frontmatter Template @@ -13,15 +13,15 @@ Consult this file when creating an agentic workflow that generates charts, visua description: [what the workflow visualizes] on: schedule: - - cron: "0 9 * * 1" # example: every Monday at 09:00 UTC + - cron: "0 9 * * 1" # weekly; adjust as needed workflow_dispatch: permissions: contents: read - actions: read # add issues/discussions scopes when needed + actions: read engine: copilot tools: cache-memory: - key: trending-data-${{ github.workflow }}-${{ github.run_id }} + key: trending-${{ github.workflow }}-${{ github.run_id }} bash: - "*" network: @@ -30,265 +30,118 @@ network: - python safe-outputs: upload-asset: - create-issue: # or create-discussion for gallery-style reports + create-issue: # or create-discussion title-prefix: "๐Ÿ“Š [Report Name]:" labels: [report] close-older-issues: true expires: 30 steps: - - name: Setup Python environment + - name: setup run: | - mkdir -p /tmp/gh-aw/python/{data,charts,artifacts} + mkdir -p /tmp/gh-aw/python/{data,charts} mkdir -p /tmp/gh-aw/cache-memory/trending pip install --user --quiet numpy pandas matplotlib seaborn scipy - - name: Upload charts + - name: upload charts if: always() uses: actions/upload-artifact@v7.0.0 with: - name: data-charts + name: charts path: /tmp/gh-aw/python/charts/*.png if-no-files-found: warn retention-days: 30 - - name: Upload source files and data - if: always() - uses: actions/upload-artifact@v7.0.0 - with: - name: python-source-and-data - path: | - /tmp/gh-aw/python/*.py - /tmp/gh-aw/python/data/* - if-no-files-found: warn - retention-days: 30 --- ``` -## Environment - -| Location | Purpose | -|---|---| -| `/tmp/gh-aw/python/` | Working directory for scripts | -| `/tmp/gh-aw/python/data/` | Input data files (CSV, JSON) | -| `/tmp/gh-aw/python/charts/` | Generated chart images (PNG) | -| 
`/tmp/gh-aw/python/artifacts/` | Additional output files | -| `/tmp/gh-aw/cache-memory/trending/` | Persistent time-series history across runs | - -**Libraries**: NumPy ยท Pandas ยท Matplotlib ยท Seaborn ยท SciPy - -Charts and Python source files are automatically uploaded as GitHub Actions artifacts (`data-charts`, `python-source-and-data`, retention 30 days) so they are available even if the workflow fails. - -## Writing the Agent Prompt - -A well-structured prompt for a data-visualization workflow has these phases: - -### Phase 1 โ€“ Load historical data -```markdown -1. Check `/tmp/gh-aw/cache-memory/trending//history.jsonl` for existing data. -2. Load it into a Pandas DataFrame if it exists; otherwise start fresh. -``` - -### Phase 2 โ€“ Collect or generate data -```markdown -1. Collect today's metrics from the GitHub API (or generate sample data with NumPy). -2. Save raw data to `/tmp/gh-aw/python/data/.csv` or `.json` โ€” never inline data in Python code. -``` - -### Phase 3 โ€“ Append to history (JSON Lines) -```markdown -Append a new record to history.jsonl with ISO 8601 timestamp, metric name, value, and metadata. -Implement a 90-day retention policy to prevent unbounded growth. -``` - -### Phase 4 โ€“ Generate charts -```markdown -1. Create trend charts if โ‰ฅ 2 historical data points exist: - - Time-series line chart with 7-day moving average - - Comparative trend chart for multiple metrics -2. Fall back to bar/distribution charts when history is empty. -3. Save all charts to `/tmp/gh-aw/python/charts/` at DPI 300, seaborn style. -``` - -### Phase 5 โ€“ Upload and report -```markdown -1. Upload each chart using the `upload asset` tool. -2. Create an issue (or discussion) embedding the uploaded chart URLs in markdown. -3. If no meaningful data was found, call `noop` with a brief explanation. -``` - -## Data Rules - -**CRITICAL**: Data must never be inlined in Python code. 
Always write data to an external file first, then load it with pandas: - -```python -# โŒ PROHIBITED -data = [10, 20, 30, 40, 50] - -# โœ… REQUIRED -import pandas as pd -data = pd.read_csv('/tmp/gh-aw/python/data/metrics.csv') -``` +## Agent Prompt Structure -## Trending Patterns +Write the agent prompt as five ordered steps: -Cache-memory at `/tmp/gh-aw/cache-memory/trending/` persists across runs. Organize it as: +1. **Load history** โ€” read `/tmp/gh-aw/cache-memory/trending//history.jsonl` into a DataFrame if it exists; otherwise start empty. +2. **Collect data** โ€” fetch metrics from the GitHub API (or generate with NumPy). Save to `/tmp/gh-aw/python/data/.csv` โ€” **never inline data in Python code**. +3. **Append & prune** โ€” append a JSON Lines record `{"timestamp": "", "metric": "...", "value": ...}` to `history.jsonl`; drop records older than 90 days. +4. **Chart** โ€” if โ‰ฅ 2 history points exist, generate a time-series line chart with 7-day moving average; otherwise use a bar/distribution chart. Save to `/tmp/gh-aw/python/charts/` at DPI 300. +5. **Report** โ€” upload each chart with `upload asset`, then create an issue/discussion embedding the URLs. Call `noop` if there is nothing to report. 
-``` -/tmp/gh-aw/cache-memory/trending/ -โ”œโ”€โ”€ / -โ”‚ โ”œโ”€โ”€ history.jsonl # Time-series data (one JSON object per line) -โ”‚ โ”œโ”€โ”€ metadata.json # Data schema and description -โ”‚ โ””โ”€โ”€ last_updated.txt # Timestamp of last update -โ””โ”€โ”€ index.json # Index of all tracked metrics -``` +## Python Patterns -### Append a daily data point +### History: load โ†’ append โ†’ prune ```python -import json -from datetime import datetime - -point = { - "timestamp": datetime.now().isoformat(), - "metric": "issue_count", - "value": 42, - "metadata": {"source": "github_api"} -} -with open('/tmp/gh-aw/cache-memory/trending/issues/history.jsonl', 'a') as f: - f.write(json.dumps(point) + '\n') -``` - -### Load history into a DataFrame +import json, os, pandas as pd +from datetime import datetime, timedelta -```python -import pandas as pd -import os +HISTORY = '/tmp/gh-aw/cache-memory/trending/issues/history.jsonl' -history_file = '/tmp/gh-aw/cache-memory/trending/issues/history.jsonl' -if os.path.exists(history_file): - df = pd.read_json(history_file, lines=True) +# Load +df = pd.read_json(HISTORY, lines=True) if os.path.exists(HISTORY) else pd.DataFrame() +if not df.empty: df['timestamp'] = pd.to_datetime(df['timestamp']) df = df.sort_values('timestamp') -else: - df = pd.DataFrame() # Start fresh if no history -``` - -### Compute a 7-day moving average - -```python -df['rolling_avg'] = df['value'].rolling(window=7, min_periods=1).mean() - -fig, ax = plt.subplots(figsize=(12, 7), dpi=300) -ax.plot(df['timestamp'], df['value'], label='Actual', alpha=0.5, marker='o') -ax.plot(df['timestamp'], df['rolling_avg'], label='7-day Average', linewidth=2.5) -ax.fill_between(df['timestamp'], df['value'], df['rolling_avg'], alpha=0.2) -ax.legend(loc='best') -``` - -### Compare multiple metrics over time - -```python -fig, ax = plt.subplots(figsize=(14, 8), dpi=300) -for metric in ['metric_a', 'metric_b', 'metric_c']: - metric_data = df[df['metric'] == metric] - 
ax.plot(metric_data['timestamp'], metric_data['value'],
-            marker='o', label=metric, linewidth=2)
-ax.set_title('Comparative Metrics Trends', fontsize=16, fontweight='bold')
-ax.legend(loc='best', fontsize=12)
-ax.grid(True, alpha=0.3)
-```
-
-### Enforce 90-day retention

```python
-from datetime import timedelta
+# Append
+with open(HISTORY, 'a') as f:
+    f.write(json.dumps({"timestamp": datetime.now().isoformat(), "metric": "issue_count", "value": 42}) + '\n')

-cutoff = pd.Timestamp.now() - timedelta(days=90)
-df = df[df['timestamp'] >= cutoff]
-df.to_json('/tmp/gh-aw/cache-memory/trending/issues/history.jsonl',
-           orient='records', lines=True)
+# Prune to 90 days (reload first so the just-appended point is kept)
+df = pd.read_json(HISTORY, lines=True)
+df = df[pd.to_datetime(df['timestamp']) >= pd.Timestamp.now() - timedelta(days=90)]
+df.to_json(HISTORY, orient='records', lines=True, date_format='iso')
```

-## Chart Quality Settings
+### Chart: trend with moving average

```python
import matplotlib.pyplot as plt
import seaborn as sns

-sns.set_style("whitegrid")
-sns.set_palette("husl")
-
+sns.set_style("whitegrid"); sns.set_palette("husl")
fig, ax = plt.subplots(figsize=(12, 7), dpi=300)
-# ... plotting code ...
-ax.set_title('Title', fontsize=16, fontweight='bold')
-ax.set_xlabel('Date', fontsize=12)
-ax.set_ylabel('Value', fontsize=12)
-ax.grid(True, alpha=0.3)
-plt.xticks(rotation=45)
-plt.tight_layout()
-plt.savefig('/tmp/gh-aw/python/charts/trend.png',
-            dpi=300, bbox_inches='tight', facecolor='white')
-```
-
-**Standards**: 300 DPI minimum · 12×7 inch figure · clear axis labels and title · legend for multi-series · grid lines enabled · colorblind-friendly palette (seaborn `husl` default)
-
-## Including Charts in Reports
-
-Assets uploaded with the `upload asset` tool are published to an orphaned git branch and become URL-addressable after the workflow completes.
+df['rolling'] = df['value'].rolling(window=7, min_periods=1).mean() +ax.plot(df['timestamp'], df['value'], label='Actual', alpha=0.5, marker='o') +ax.plot(df['timestamp'], df['rolling'], label='7-day avg', linewidth=2.5) +ax.fill_between(df['timestamp'], df['value'], df['rolling'], alpha=0.2) +ax.set_title('Metric Trend', fontsize=16, fontweight='bold') +ax.set_xlabel('Date', fontsize=12); ax.set_ylabel('Value', fontsize=12) +ax.legend(); ax.grid(True, alpha=0.3); plt.xticks(rotation=45); plt.tight_layout() +plt.savefig('/tmp/gh-aw/python/charts/trend.png', dpi=300, bbox_inches='tight', facecolor='white') +``` -```markdown -## Visualization Results +**Chart standards**: 300 DPI ยท 12ร—7 in ยท labeled axes and title ยท legend for multi-series ยท `husl` palette -![Chart description](https://raw.githubusercontent.com/owner/repo/assets/workflow-name/trend.png) +### Multiple metrics -The chart above shows... +```python +for metric in metrics: + sub = df[df['metric'] == metric] + ax.plot(sub['timestamp'], sub['value'], marker='o', label=metric, linewidth=2) ``` -## Report Structure - -When creating the issue or discussion, use this template: +## Report Template ```markdown -# ๐Ÿ“Š [Report Title] โ€” [Date] +# ๐Ÿ“Š [Title] โ€” [Date] ## Summary -[2โ€“3 sentences describing trends and key findings] +[2โ€“3 sentences on trends and key findings] -## [Metric 1] Trend -![Metric 1 trend chart](URL_FROM_UPLOAD_ASSET) -[Brief analysis: direction, moving average, notable events] - -## [Metric 2] Distribution -![Metric 2 chart](URL_FROM_UPLOAD_ASSET) -[Brief analysis] +### [Metric] Trend +![chart](URL_FROM_UPLOAD_ASSET) +[direction, moving average, notable events] ## Data Details -- **Source**: [GitHub API / generated sample / external] -- **Data points**: [count] -- **Date range**: [start] to [end] -- **Tracking period**: [N] days - -## Cache Status -- **Metrics tracked**: [list] -- **History location**: `/tmp/gh-aw/cache-memory/trending/` +- **Source**: โ€ฆ | **Points**: โ€ฆ | 
**Range**: โ€ฆ | **Period**: โ€ฆN days +- **Cache**: `/tmp/gh-aw/cache-memory/trending/` ``` -Use `###` and lower for all headers inside the report body โ€” `#` and `##` are reserved for issue/discussion titles. - -## Common Use Cases - -| Intent | Notes | -|---|---| -| "Create a weekly GitHub activity chart" | Schedule weekly; track issues, PRs, commits | -| "Visualize test coverage trends over time" | Trigger on push/PR; append per-run metrics | -| "Generate a dashboard of workflow run durations" | Schedule daily; use GitHub Actions API | -| "Plot stale repo aging distribution" | On-demand; no trending needed, skip cache | -| "Track contributor growth month-over-month" | Schedule monthly; long retention (365 days) | +Use `###` and deeper for all headers inside the report body. -## Key Reminders +## Use Cases -- โœ… **Check cache first** โ€” load historical data before collecting new data -- โœ… **Append, never overwrite** โ€” use JSON Lines for time-series history -- โœ… **External data files only** โ€” never inline data in Python -- โœ… **Upload charts before reporting** โ€” collect all asset URLs, then create the issue/discussion -- โœ… **Call `noop` if nothing to report** โ€” required when no safe-output action is taken -- โœ… **Use DPI 300** and seaborn styling for publication-quality charts -- โœ… **90-day retention** โ€” prune history to prevent unbounded growth +| Intent | Trigger | Notes | +|---|---|---| +| Weekly GitHub activity chart | `schedule` weekly | track issues, PRs, commits | +| Test coverage trends | `push`/`pull_request` | append per-run | +| Workflow run durations | `schedule` daily | GitHub Actions API | +| Stale repo aging distribution | `workflow_dispatch` | no cache needed | +| Contributor growth | `schedule` monthly | 365-day retention | From 8e971c5046dfa7dc8f3a0c0d51a825ae318e1bd2 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 20 Mar 2026 22:32:49 +0000 Subject: [PATCH 
6/7] feat: advertise data-science.md in agentic-workflows.agent.md dispatcher Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/692b0ef5-d578-4a70-abbc-293f43df9399 --- .github/agents/agentic-workflows.agent.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/.github/agents/agentic-workflows.agent.md b/.github/agents/agentic-workflows.agent.md index c0f21877e1b..d4e69648476 100644 --- a/.github/agents/agentic-workflows.agent.md +++ b/.github/agents/agentic-workflows.agent.md @@ -19,6 +19,7 @@ This is a **dispatcher agent** that routes your request to the appropriate speci - **Creating shared components**: Routes to `create-shared-agentic-workflow` prompt - **Fixing Dependabot PRs**: Routes to `dependabot` prompt โ€” use this when Dependabot opens PRs that modify generated manifest files (`.github/workflows/package.json`, `.github/workflows/requirements.txt`, `.github/workflows/go.mod`). Never merge those PRs directly; instead update the source `.md` files and rerun `gh aw compile --dependabot` to bundle all fixes - **Analyzing test coverage**: Routes to `test-coverage` prompt โ€” consult this whenever the workflow reads, analyzes, or reports on test coverage data from PRs or CI runs +- **Data visualization / data science**: Routes to `data-science` prompt โ€” consult this whenever the workflow generates charts, trend plots, dashboards, or any Python-based metric visualization Workflows may optionally include: @@ -129,6 +130,18 @@ When you interact with this agent, it will: - "Analyze coverage trends over time" - "Add a coverage gate that blocks PRs below a threshold" +### Generate Charts and Trend Visualizations (Data Science) +**Load when**: The workflow generates charts, trend plots, dashboards, histograms, time-series graphs, or any Python-based data visualization โ€” including recurring metric reports with persistent historical data + +**Prompt file**: 
https://github.com/github/gh-aw/blob/main/.github/aw/data-science.md + +**Use cases**: +- "Create a workflow that charts weekly GitHub activity trends" +- "Visualize test coverage over time with moving averages" +- "Generate a monthly contributor growth dashboard" +- "Plot issue aging distribution and post it as a report" +- "Build a workflow run duration tracker with trend charts" + ## Instructions When a user interacts with you: From 067c52950c008f9b80c7987e6a895bce0eebe73d Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 20 Mar 2026 22:41:23 +0000 Subject: [PATCH 7/7] fix: trim non-data-science frontmatter fields and use ?raw=true URL in report template Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d968a547-7632-453c-a0ac-0aea7e6add98 --- .github/agents/agentic-workflows.agent.md | 13 ------------ .github/aw/data-science.md | 24 +---------------------- 2 files changed, 1 insertion(+), 36 deletions(-) diff --git a/.github/agents/agentic-workflows.agent.md b/.github/agents/agentic-workflows.agent.md index d4e69648476..c0f21877e1b 100644 --- a/.github/agents/agentic-workflows.agent.md +++ b/.github/agents/agentic-workflows.agent.md @@ -19,7 +19,6 @@ This is a **dispatcher agent** that routes your request to the appropriate speci - **Creating shared components**: Routes to `create-shared-agentic-workflow` prompt - **Fixing Dependabot PRs**: Routes to `dependabot` prompt โ€” use this when Dependabot opens PRs that modify generated manifest files (`.github/workflows/package.json`, `.github/workflows/requirements.txt`, `.github/workflows/go.mod`). 
Never merge those PRs directly; instead update the source `.md` files and rerun `gh aw compile --dependabot` to bundle all fixes - **Analyzing test coverage**: Routes to `test-coverage` prompt โ€” consult this whenever the workflow reads, analyzes, or reports on test coverage data from PRs or CI runs -- **Data visualization / data science**: Routes to `data-science` prompt โ€” consult this whenever the workflow generates charts, trend plots, dashboards, or any Python-based metric visualization Workflows may optionally include: @@ -130,18 +129,6 @@ When you interact with this agent, it will: - "Analyze coverage trends over time" - "Add a coverage gate that blocks PRs below a threshold" -### Generate Charts and Trend Visualizations (Data Science) -**Load when**: The workflow generates charts, trend plots, dashboards, histograms, time-series graphs, or any Python-based data visualization โ€” including recurring metric reports with persistent historical data - -**Prompt file**: https://github.com/github/gh-aw/blob/main/.github/aw/data-science.md - -**Use cases**: -- "Create a workflow that charts weekly GitHub activity trends" -- "Visualize test coverage over time with moving averages" -- "Generate a monthly contributor growth dashboard" -- "Plot issue aging distribution and post it as a report" -- "Build a workflow run duration tracker with trend charts" - ## Instructions When a user interacts with you: diff --git a/.github/aw/data-science.md b/.github/aw/data-science.md index a6ca029ce10..7b30db5b2eb 100644 --- a/.github/aw/data-science.md +++ b/.github/aw/data-science.md @@ -10,20 +10,6 @@ Use when creating a workflow that generates charts, trend visualizations, dashbo ```yaml --- -description: [what the workflow visualizes] -on: - schedule: - - cron: "0 9 * * 1" # weekly; adjust as needed - workflow_dispatch: -permissions: - contents: read - actions: read -engine: copilot -tools: - cache-memory: - key: trending-${{ github.workflow }}-${{ github.run_id }} - bash: - 
- "*" network: allowed: - defaults @@ -41,14 +27,6 @@ steps: mkdir -p /tmp/gh-aw/python/{data,charts} mkdir -p /tmp/gh-aw/cache-memory/trending pip install --user --quiet numpy pandas matplotlib seaborn scipy - - name: upload charts - if: always() - uses: actions/upload-artifact@v7.0.0 - with: - name: charts - path: /tmp/gh-aw/python/charts/*.png - if-no-files-found: warn - retention-days: 30 --- ``` @@ -126,7 +104,7 @@ for metric in metrics: [2โ€“3 sentences on trends and key findings] ### [Metric] Trend -![chart](URL_FROM_UPLOAD_ASSET) +![chart](https://github.com/OWNER/REPO/blob/assets/WORKFLOW/chart.png?raw=true) [direction, moving average, notable events] ## Data Details