Add log-searching-agent workflow with inline preflight search#548
Conversation
New workflow that downloads GitHub Actions logs for a specified workflow, searches them for caller-supplied terms (comma-separated exact matches), and hands the search results to an AI agent for pattern analysis and triage issue creation. All preflight logic is inline in the workflow frontmatter steps to support workflow_call without external script references. Co-authored-by: strawgate <6384545+strawgate@users.noreply.github.com>
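The comma-separated, exact-match term handling described above can be sketched in isolation. This is a minimal standalone version of the preflight's term parsing (variable names mirror the workflow step; the sample terms are illustrative only):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Illustrative input: note the stray spaces and trailing comma.
SEARCH_TERMS="OutOfMemoryError, DEADLINE_EXCEEDED , "

# Split on commas, trim surrounding whitespace, and drop empty entries.
IFS=',' read -ra TERMS <<< "$SEARCH_TERMS"
CLEAN_TERMS=()
for term in "${TERMS[@]}"; do
  cleaned=$(echo "$term" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
  [ -n "$cleaned" ] && CLEAN_TERMS+=("$cleaned")
done

printf '%s\n' "${CLEAN_TERMS[@]}"
```

Each cleaned term is then matched literally (`grep -F`) against every log file, so no regex escaping is needed from callers.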
📝 Walkthrough
This pull request introduces a Log Searching Agent workflow system for GitHub Actions that downloads and searches workflow logs for specified terms within a configurable time window, then analyzes the results with an AI agent to identify patterns and file triage issues. The implementation comprises a reusable workflow definition with multi-stage execution (pre-activation, activation, agent, safe outputs, conclusion), a Markdown specification, documentation, an example workflow file, and updates to the dogfood exclusion script.
🚥 Pre-merge checks: ✅ 2 passed
Quoted context (gh-agent-workflows/log-searching-agent/README.md):

> ## Safe Outputs
> - `create-issue` — file a log search investigation report (max 1, auto-closes older reports)
[LOW] Safe output behavior description does not match workflow config
This README says create-issue will "auto-close older reports", but the workflow config sets close-older-issues: false (in .github/workflows/gh-aw-log-searching-agent.md). That mismatch can mislead users about issue lifecycle behavior.
Suggested change:
- `create-issue` — file a log search investigation report (max 1, auto-closes older reports)
+ `create-issue` — file a log search investigation report (max 1)
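For reference, the frontmatter fragment the review is comparing against would look roughly like this. The key names are taken from the review comment itself (`create-issue`, `close-older-issues`, max 1); the surrounding structure is a sketch, and the exact gh-aw schema is not verified here:

```yaml
safe-outputs:
  create-issue:
    max: 1
    # The review notes this is currently false, contradicting the README's
    # "auto-closes older reports" claim; setting it true would match the README.
    close-older-issues: false
```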
@copilot Would it be feasible to execute this log-searching-agent over all the defined workflows in the repository?
Yes — the caller's trigger workflow can use a GitHub Actions matrix strategy to fan out over all (or a subset of) workflows. For example:

```yaml
jobs:
  discover:
    runs-on: ubuntu-latest
    outputs:
      workflows: ${{ steps.list.outputs.workflows }}
    steps:
      - id: list
        run: |
          workflows=$(gh api repos/${{ github.repository }}/actions/workflows \
            --jq '[.workflows[].path | split("/")[-1]] | tojson')
          echo "workflows=$workflows" >> "$GITHUB_OUTPUT"
        env:
          GH_TOKEN: ${{ github.token }}
  search:
    needs: discover
    strategy:
      matrix:
        workflow: ${{ fromJSON(needs.discover.outputs.workflows) }}
    uses: elastic/ai-github-actions/.github/workflows/gh-aw-log-searching-agent.lock.yml@v0
    with:
      workflow: ${{ matrix.workflow }}
      search-terms: "OutOfMemoryError, DEADLINE_EXCEEDED"
    secrets:
      COPILOT_GITHUB_TOKEN: ${{ secrets.COPILOT_GITHUB_TOKEN }}
```

Each matrix leg runs the preflight search + agent independently, so you get a separate triage issue per workflow when actionable findings exist.
@strawgate Do you think we can merge this?
Pull Request is not mergeable
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/gh-aw-log-searching-agent.lock.yml:
- Around line 16-20: The committed lockfile
.github/workflows/gh-aw-log-searching-agent.lock.yml is out of sync with its
source .github/workflows/gh-aw-log-searching-agent.md; recompile the generated
workflow and update the lockfile by running the generator (gh aw compile or make
compile) on gh-aw-log-searching-agent.md, verify the resulting .lock.yml matches
the compiled output, and commit the updated
.github/workflows/gh-aw-log-searching-agent.lock.yml to unblock CI.
- Line 593: The manifest's runs_searched uses COLLECTED which counts selected
runs but not whether logs were successfully downloaded/extracted; change the
logic to track only successfully processed runs: introduce a new counter (e.g.
PROCESSED_RUNS=0), increment PROCESSED_RUNS only after a successful gh api
download and successful unzip (inside the loop after extraction, near the
existing unzip success path), use PROCESSED_RUNS in the manifest "runs_searched"
field and in the final summary echoes, and leave COLLECTED for limiting how many
run IDs are selected; alternatively, if you prefer to keep COLLECTED as the
processed count, decrement COLLECTED or avoid incrementing it until after
successful download/extract (update references to COLLECTED in
messages/manifests accordingly).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d033038f-3611-4cf5-9e5b-adaad3ee4423
📒 Files selected for processing (2)
- .github/workflows/gh-aw-log-searching-agent.lock.yml
- scripts/dogfood.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/dogfood.sh
# This file was automatically generated by gh-aw. DO NOT EDIT.
#
# To update this file, edit the corresponding .md file and run:
#   gh aw compile
# Not all edits will cause changes to this file.
Re-sync generated lockfile to unblock CI
Internal: CI reports this file is modified after make compile, so the committed lock output is out of sync with its .md source. Please regenerate from .github/workflows/gh-aw-log-searching-agent.md (gh aw compile / make compile) and commit the updated lockfile.
Based on learnings, .lock.yml files are generated artifacts and must match the compiled output from their corresponding .md source.
SEARCH_TERMS: ${{ inputs.search-terms }}
SEARCH_WORKFLOW: ${{ inputs.workflow }}
name: Search workflow logs
| run: "set -euo pipefail\n\nLOGS_DIR=\"/tmp/gh-aw/logs\"\nRESULTS_DIR=\"/tmp/gh-aw/search-results\"\nmkdir -p \"$LOGS_DIR\" \"$RESULTS_DIR\"\n\n# Compute the since date (N days ago)\nSINCE_DATE=$(date -u -d \"$SEARCH_DAYS days ago\" +%Y-%m-%dT00:00:00Z 2>/dev/null \\\n || date -u -v-\"${SEARCH_DAYS}\"d +%Y-%m-%dT00:00:00Z)\n\necho \"Fetching up to $SEARCH_MAX_RUNS runs of '$SEARCH_WORKFLOW' since $SINCE_DATE (conclusion: $SEARCH_CONCLUSION)...\"\n\n# List matching workflow runs\nPAGE=1\nCOLLECTED=0\nRUN_IDS=\"\"\nwhile [ \"$COLLECTED\" -lt \"$SEARCH_MAX_RUNS\" ]; do\n RESPONSE=$(gh api \"repos/$GITHUB_REPOSITORY/actions/workflows/$SEARCH_WORKFLOW/runs?per_page=100&page=$PAGE&created=>=$SINCE_DATE\" --jq '.workflow_runs')\n COUNT=$(echo \"$RESPONSE\" | jq 'length')\n [ \"$COUNT\" -gt 0 ] || break\n\n for i in $(seq 0 $((COUNT - 1))); do\n RUN_CONCLUSION=$(echo \"$RESPONSE\" | jq -r \".[$i].conclusion\")\n RUN_ID=$(echo \"$RESPONSE\" | jq -r \".[$i].id\")\n\n if [ \"$SEARCH_CONCLUSION\" = \"any\" ] || [ \"$RUN_CONCLUSION\" = \"$SEARCH_CONCLUSION\" ]; then\n RUN_IDS=\"$RUN_IDS $RUN_ID\"\n COLLECTED=$((COLLECTED + 1))\n [ \"$COLLECTED\" -lt \"$SEARCH_MAX_RUNS\" ] || break\n fi\n done\n PAGE=$((PAGE + 1))\ndone\n\nif [ -z \"$RUN_IDS\" ]; then\n echo \"No matching workflow runs found.\"\n echo '{\"workflow\":\"'\"$SEARCH_WORKFLOW\"'\",\"search_terms\":[],\"since\":\"'\"$SINCE_DATE\"'\",\"runs_searched\":0,\"total_matches\":0,\"results\":[]}' > \"$RESULTS_DIR/search-manifest.json\"\n exit 0\nfi\n\necho \"Found $COLLECTED matching run(s). 
Downloading and searching logs...\"\n\n# Parse search terms (comma-separated) into an array\nIFS=',' read -ra TERMS <<< \"$SEARCH_TERMS\"\n# Trim whitespace from each term\nCLEAN_TERMS=()\nfor term in \"${TERMS[@]}\"; do\n cleaned=$(echo \"$term\" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')\n [ -n \"$cleaned\" ] && CLEAN_TERMS+=(\"$cleaned\")\ndone\n\nMANIFEST_ENTRIES=\"\"\nTOTAL_MATCHES=0\n\nfor RUN_ID in $RUN_IDS; do\n RUN_DIR=\"$LOGS_DIR/$RUN_ID\"\n mkdir -p \"$RUN_DIR\"\n\n # Download and extract logs\n echo \" Downloading logs for run $RUN_ID...\"\n if ! gh api \"repos/$GITHUB_REPOSITORY/actions/runs/$RUN_ID/logs\" > \"$RUN_DIR/logs.zip\" 2>/dev/null; then\n echo \" Warning: could not download logs for run $RUN_ID\"\n continue\n fi\n\n if ! unzip -q -o \"$RUN_DIR/logs.zip\" -d \"$RUN_DIR/\" 2>/dev/null; then\n echo \" Warning: could not extract logs for run $RUN_ID\"\n continue\n fi\n rm -f \"$RUN_DIR/logs.zip\"\n\n # Search each log file for each term\n RUN_MATCHES=\"\"\n RUN_MATCH_COUNT=0\n while IFS= read -r -d '' logfile; do\n for term in \"${CLEAN_TERMS[@]}\"; do\n MATCHES=$(grep -n -F \"$term\" \"$logfile\" 2>/dev/null || true)\n if [ -n \"$MATCHES\" ]; then\n MATCH_COUNT=$(echo \"$MATCHES\" | wc -l)\n RUN_MATCH_COUNT=$((RUN_MATCH_COUNT + MATCH_COUNT))\n RELATIVE_PATH=\"${logfile#$LOGS_DIR/}\"\n\n # Write matched lines with context to a result file\n RESULT_FILE=\"$RESULTS_DIR/run-${RUN_ID}-matches.txt\"\n {\n echo \"=== File: $RELATIVE_PATH | Term: $term | Matches: $MATCH_COUNT ===\"\n grep -n -F -B2 -A2 \"$term\" \"$logfile\" 2>/dev/null || true\n echo \"\"\n } >> \"$RESULT_FILE\"\n\n # Build JSON entry for this file match\n ESCAPED_PATH=$(echo \"$RELATIVE_PATH\" | jq -Rs '.')\n ESCAPED_TERM=$(echo \"$term\" | jq -Rs '.')\n RUN_MATCHES=\"$RUN_MATCHES{\\\"file\\\":$ESCAPED_PATH,\\\"term\\\":$ESCAPED_TERM,\\\"count\\\":$MATCH_COUNT},\"\n fi\n done\n done < <(find \"$RUN_DIR\" -name \"*.txt\" -print0)\n\n if [ \"$RUN_MATCH_COUNT\" -gt 0 ]; then\n 
TOTAL_MATCHES=$((TOTAL_MATCHES + RUN_MATCH_COUNT))\n RUN_MATCHES=\"${RUN_MATCHES%,}\"\n RUN_URL=\"https://github.com/$GITHUB_REPOSITORY/actions/runs/$RUN_ID\"\n MANIFEST_ENTRIES=\"$MANIFEST_ENTRIES{\\\"run_id\\\":$RUN_ID,\\\"url\\\":\\\"$RUN_URL\\\",\\\"match_count\\\":$RUN_MATCH_COUNT,\\\"file_matches\\\":[$RUN_MATCHES]},\"\n echo \" Run $RUN_ID: $RUN_MATCH_COUNT match(es)\"\n else\n echo \" Run $RUN_ID: no matches\"\n # Clean up log files for runs with no matches to save disk\n rm -rf \"$RUN_DIR\"\n fi\ndone\n\n# Build search terms JSON array\nTERMS_JSON=\"[\"\nfor term in \"${CLEAN_TERMS[@]}\"; do\n ESCAPED=$(echo \"$term\" | jq -Rs '.')\n TERMS_JSON=\"$TERMS_JSON$ESCAPED,\"\ndone\nTERMS_JSON=\"${TERMS_JSON%,}]\"\n\n# Write the search manifest\nMANIFEST_ENTRIES=\"${MANIFEST_ENTRIES%,}\"\ncat > \"$RESULTS_DIR/search-manifest.json\" <<MANIFEST_EOF\n{\n \"workflow\": \"$SEARCH_WORKFLOW\",\n \"search_terms\": $TERMS_JSON,\n \"since\": \"$SINCE_DATE\",\n \"conclusion_filter\": \"$SEARCH_CONCLUSION\",\n \"runs_searched\": $COLLECTED,\n \"total_matches\": $TOTAL_MATCHES,\n \"results\": [$MANIFEST_ENTRIES]\n}\nMANIFEST_EOF\n\necho \"\"\necho \"Search complete. Total matches: $TOTAL_MATCHES across $COLLECTED run(s).\"\necho \"Results written to $RESULTS_DIR/\"\n" |
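One detail worth calling out in the script above: the since-date computation relies on a GNU/BSD `date` fallback so the same line works on both Linux and macOS runners. Extracted as a standalone sketch:

```shell
#!/usr/bin/env bash
# GNU date (coreutils) accepts -d "N days ago"; BSD/macOS date uses -v-Nd.
# Trying the GNU form first and falling back via || covers both.
SEARCH_DAYS=7
SINCE_DATE=$(date -u -d "$SEARCH_DAYS days ago" +%Y-%m-%dT00:00:00Z 2>/dev/null \
  || date -u -v-"${SEARCH_DAYS}"d +%Y-%m-%dT00:00:00Z)
echo "$SINCE_DATE"
```

The result is an ISO-8601 midnight boundary suitable for the `created=>=` filter in the runs API query.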
runs_searched can be incorrect when run log download/extract fails
The script counts a run as searched when it is selected, but not when logs are actually downloaded/extracted. If gh api .../logs or unzip fails, the manifest still reports it in runs_searched, which can understate false negatives in analysis.
Suggested fix (in source .md, then recompile)
- COLLECTED=0
+ COLLECTED=0
+ RUNS_SEARCHED=0
+ RUNS_FAILED=0
for RUN_ID in $RUN_IDS; do
...
if ! gh api "repos/$GITHUB_REPOSITORY/actions/runs/$RUN_ID/logs" > "$RUN_DIR/logs.zip" 2>/dev/null; then
+ RUNS_FAILED=$((RUNS_FAILED + 1))
continue
fi
if ! unzip -q -o "$RUN_DIR/logs.zip" -d "$RUN_DIR/" 2>/dev/null; then
+ RUNS_FAILED=$((RUNS_FAILED + 1))
continue
fi
+ RUNS_SEARCHED=$((RUNS_SEARCHED + 1))
...
done
- "runs_searched": $COLLECTED,
+ "runs_searched": $RUNS_SEARCHED,
+ "runs_failed": $RUNS_FAILED,
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| run: "set -euo pipefail\n\nLOGS_DIR=\"/tmp/gh-aw/logs\"\nRESULTS_DIR=\"/tmp/gh-aw/search-results\"\nmkdir -p \"$LOGS_DIR\" \"$RESULTS_DIR\"\n\n# Compute the since date (N days ago)\nSINCE_DATE=$(date -u -d \"$SEARCH_DAYS days ago\" +%Y-%m-%dT00:00:00Z 2>/dev/null \\\n || date -u -v-\"${SEARCH_DAYS}\"d +%Y-%m-%dT00:00:00Z)\n\necho \"Fetching up to $SEARCH_MAX_RUNS runs of '$SEARCH_WORKFLOW' since $SINCE_DATE (conclusion: $SEARCH_CONCLUSION)...\"\n\n# List matching workflow runs\nPAGE=1\nCOLLECTED=0\nRUN_IDS=\"\"\nwhile [ \"$COLLECTED\" -lt \"$SEARCH_MAX_RUNS\" ]; do\n RESPONSE=$(gh api \"repos/$GITHUB_REPOSITORY/actions/workflows/$SEARCH_WORKFLOW/runs?per_page=100&page=$PAGE&created=>=$SINCE_DATE\" --jq '.workflow_runs')\n COUNT=$(echo \"$RESPONSE\" | jq 'length')\n [ \"$COUNT\" -gt 0 ] || break\n\n for i in $(seq 0 $((COUNT - 1))); do\n RUN_CONCLUSION=$(echo \"$RESPONSE\" | jq -r \".[$i].conclusion\")\n RUN_ID=$(echo \"$RESPONSE\" | jq -r \".[$i].id\")\n\n if [ \"$SEARCH_CONCLUSION\" = \"any\" ] || [ \"$RUN_CONCLUSION\" = \"$SEARCH_CONCLUSION\" ]; then\n RUN_IDS=\"$RUN_IDS $RUN_ID\"\n COLLECTED=$((COLLECTED + 1))\n [ \"$COLLECTED\" -lt \"$SEARCH_MAX_RUNS\" ] || break\n fi\n done\n PAGE=$((PAGE + 1))\ndone\n\nif [ -z \"$RUN_IDS\" ]; then\n echo \"No matching workflow runs found.\"\n echo '{\"workflow\":\"'\"$SEARCH_WORKFLOW\"'\",\"search_terms\":[],\"since\":\"'\"$SINCE_DATE\"'\",\"runs_searched\":0,\"total_matches\":0,\"results\":[]}' > \"$RESULTS_DIR/search-manifest.json\"\n exit 0\nfi\n\necho \"Found $COLLECTED matching run(s). 
Downloading and searching logs...\"\n\n# Parse search terms (comma-separated) into an array\nIFS=',' read -ra TERMS <<< \"$SEARCH_TERMS\"\n# Trim whitespace from each term\nCLEAN_TERMS=()\nfor term in \"${TERMS[@]}\"; do\n cleaned=$(echo \"$term\" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')\n [ -n \"$cleaned\" ] && CLEAN_TERMS+=(\"$cleaned\")\ndone\n\nMANIFEST_ENTRIES=\"\"\nTOTAL_MATCHES=0\n\nfor RUN_ID in $RUN_IDS; do\n RUN_DIR=\"$LOGS_DIR/$RUN_ID\"\n mkdir -p \"$RUN_DIR\"\n\n # Download and extract logs\n echo \" Downloading logs for run $RUN_ID...\"\n if ! gh api \"repos/$GITHUB_REPOSITORY/actions/runs/$RUN_ID/logs\" > \"$RUN_DIR/logs.zip\" 2>/dev/null; then\n echo \" Warning: could not download logs for run $RUN_ID\"\n continue\n fi\n\n if ! unzip -q -o \"$RUN_DIR/logs.zip\" -d \"$RUN_DIR/\" 2>/dev/null; then\n echo \" Warning: could not extract logs for run $RUN_ID\"\n continue\n fi\n rm -f \"$RUN_DIR/logs.zip\"\n\n # Search each log file for each term\n RUN_MATCHES=\"\"\n RUN_MATCH_COUNT=0\n while IFS= read -r -d '' logfile; do\n for term in \"${CLEAN_TERMS[@]}\"; do\n MATCHES=$(grep -n -F \"$term\" \"$logfile\" 2>/dev/null || true)\n if [ -n \"$MATCHES\" ]; then\n MATCH_COUNT=$(echo \"$MATCHES\" | wc -l)\n RUN_MATCH_COUNT=$((RUN_MATCH_COUNT + MATCH_COUNT))\n RELATIVE_PATH=\"${logfile#$LOGS_DIR/}\"\n\n # Write matched lines with context to a result file\n RESULT_FILE=\"$RESULTS_DIR/run-${RUN_ID}-matches.txt\"\n {\n echo \"=== File: $RELATIVE_PATH | Term: $term | Matches: $MATCH_COUNT ===\"\n grep -n -F -B2 -A2 \"$term\" \"$logfile\" 2>/dev/null || true\n echo \"\"\n } >> \"$RESULT_FILE\"\n\n # Build JSON entry for this file match\n ESCAPED_PATH=$(echo \"$RELATIVE_PATH\" | jq -Rs '.')\n ESCAPED_TERM=$(echo \"$term\" | jq -Rs '.')\n RUN_MATCHES=\"$RUN_MATCHES{\\\"file\\\":$ESCAPED_PATH,\\\"term\\\":$ESCAPED_TERM,\\\"count\\\":$MATCH_COUNT},\"\n fi\n done\n done < <(find \"$RUN_DIR\" -name \"*.txt\" -print0)\n\n if [ \"$RUN_MATCH_COUNT\" -gt 0 ]; then\n 
TOTAL_MATCHES=$((TOTAL_MATCHES + RUN_MATCH_COUNT))\n RUN_MATCHES=\"${RUN_MATCHES%,}\"\n RUN_URL=\"https://github.com/$GITHUB_REPOSITORY/actions/runs/$RUN_ID\"\n MANIFEST_ENTRIES=\"$MANIFEST_ENTRIES{\\\"run_id\\\":$RUN_ID,\\\"url\\\":\\\"$RUN_URL\\\",\\\"match_count\\\":$RUN_MATCH_COUNT,\\\"file_matches\\\":[$RUN_MATCHES]},\"\n echo \" Run $RUN_ID: $RUN_MATCH_COUNT match(es)\"\n else\n echo \" Run $RUN_ID: no matches\"\n # Clean up log files for runs with no matches to save disk\n rm -rf \"$RUN_DIR\"\n fi\ndone\n\n# Build search terms JSON array\nTERMS_JSON=\"[\"\nfor term in \"${CLEAN_TERMS[@]}\"; do\n ESCAPED=$(echo \"$term\" | jq -Rs '.')\n TERMS_JSON=\"$TERMS_JSON$ESCAPED,\"\ndone\nTERMS_JSON=\"${TERMS_JSON%,}]\"\n\n# Write the search manifest\nMANIFEST_ENTRIES=\"${MANIFEST_ENTRIES%,}\"\ncat > \"$RESULTS_DIR/search-manifest.json\" <<MANIFEST_EOF\n{\n \"workflow\": \"$SEARCH_WORKFLOW\",\n \"search_terms\": $TERMS_JSON,\n \"since\": \"$SINCE_DATE\",\n \"conclusion_filter\": \"$SEARCH_CONCLUSION\",\n \"runs_searched\": $COLLECTED,\n \"total_matches\": $TOTAL_MATCHES,\n \"results\": [$MANIFEST_ENTRIES]\n}\nMANIFEST_EOF\n\necho \"\"\necho \"Search complete. Total matches: $TOTAL_MATCHES across $COLLECTED run(s).\"\necho \"Results written to $RESULTS_DIR/\"\n" | |
| run: "set -euo pipefail\n\nLOGS_DIR=\"/tmp/gh-aw/logs\"\nRESULTS_DIR=\"/tmp/gh-aw/search-results\"\nmkdir -p \"$LOGS_DIR\" \"$RESULTS_DIR\"\n\n# Compute the since date (N days ago)\nSINCE_DATE=$(date -u -d \"$SEARCH_DAYS days ago\" +%Y-%m-%dT00:00:00Z 2>/dev/null \\\n || date -u -v-\"${SEARCH_DAYS}\"d +%Y-%m-%dT00:00:00Z)\n\necho \"Fetching up to $SEARCH_MAX_RUNS runs of '$SEARCH_WORKFLOW' since $SINCE_DATE (conclusion: $SEARCH_CONCLUSION)...\"\n\n# List matching workflow runs\nPAGE=1\nCOLLECTED=0\nRUNS_SEARCHED=0\nRUNS_FAILED=0\nRUN_IDS=\"\"\nwhile [ \"$COLLECTED\" -lt \"$SEARCH_MAX_RUNS\" ]; do\n RESPONSE=$(gh api \"repos/$GITHUB_REPOSITORY/actions/workflows/$SEARCH_WORKFLOW/runs?per_page=100&page=$PAGE&created=>=$SINCE_DATE\" --jq '.workflow_runs')\n COUNT=$(echo \"$RESPONSE\" | jq 'length')\n [ \"$COUNT\" -gt 0 ] || break\n\n for i in $(seq 0 $((COUNT - 1))); do\n RUN_CONCLUSION=$(echo \"$RESPONSE\" | jq -r \".[$i].conclusion\")\n RUN_ID=$(echo \"$RESPONSE\" | jq -r \".[$i].id\")\n\n if [ \"$SEARCH_CONCLUSION\" = \"any\" ] || [ \"$RUN_CONCLUSION\" = \"$SEARCH_CONCLUSION\" ]; then\n RUN_IDS=\"$RUN_IDS $RUN_ID\"\n COLLECTED=$((COLLECTED + 1))\n [ \"$COLLECTED\" -lt \"$SEARCH_MAX_RUNS\" ] || break\n fi\n done\n PAGE=$((PAGE + 1))\ndone\n\nif [ -z \"$RUN_IDS\" ]; then\n echo \"No matching workflow runs found.\"\n echo '{\"workflow\":\"'\"$SEARCH_WORKFLOW\"'\",\"search_terms\":[],\"since\":\"'\"$SINCE_DATE\"'\",\"runs_searched\":0,\"total_matches\":0,\"results\":[]}' > \"$RESULTS_DIR/search-manifest.json\"\n exit 0\nfi\n\necho \"Found $COLLECTED matching run(s). 
Downloading and searching logs...\"\n\n# Parse search terms (comma-separated) into an array\nIFS=',' read -ra TERMS <<< \"$SEARCH_TERMS\"\n# Trim whitespace from each term\nCLEAN_TERMS=()\nfor term in \"${TERMS[@]}\"; do\n cleaned=$(echo \"$term\" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')\n [ -n \"$cleaned\" ] && CLEAN_TERMS+=(\"$cleaned\")\ndone\n\nMANIFEST_ENTRIES=\"\"\nTOTAL_MATCHES=0\n\nfor RUN_ID in $RUN_IDS; do\n RUN_DIR=\"$LOGS_DIR/$RUN_ID\"\n mkdir -p \"$RUN_DIR\"\n\n # Download and extract logs\n echo \" Downloading logs for run $RUN_ID...\"\n if ! gh api \"repos/$GITHUB_REPOSITORY/actions/runs/$RUN_ID/logs\" > \"$RUN_DIR/logs.zip\" 2>/dev/null; then\n echo \" Warning: could not download logs for run $RUN_ID\"\n RUNS_FAILED=$((RUNS_FAILED + 1))\n continue\n fi\n\n if ! unzip -q -o \"$RUN_DIR/logs.zip\" -d \"$RUN_DIR/\" 2>/dev/null; then\n echo \" Warning: could not extract logs for run $RUN_ID\"\n RUNS_FAILED=$((RUNS_FAILED + 1))\n continue\n fi\n RUNS_SEARCHED=$((RUNS_SEARCHED + 1))\n rm -f \"$RUN_DIR/logs.zip\"\n\n # Search each log file for each term\n RUN_MATCHES=\"\"\n RUN_MATCH_COUNT=0\n while IFS= read -r -d '' logfile; do\n for term in \"${CLEAN_TERMS[@]}\"; do\n MATCHES=$(grep -n -F \"$term\" \"$logfile\" 2>/dev/null || true)\n if [ -n \"$MATCHES\" ]; then\n MATCH_COUNT=$(echo \"$MATCHES\" | wc -l)\n RUN_MATCH_COUNT=$((RUN_MATCH_COUNT + MATCH_COUNT))\n RELATIVE_PATH=\"${logfile#$LOGS_DIR/}\"\n\n # Write matched lines with context to a result file\n RESULT_FILE=\"$RESULTS_DIR/run-${RUN_ID}-matches.txt\"\n {\n echo \"=== File: $RELATIVE_PATH | Term: $term | Matches: $MATCH_COUNT ===\"\n grep -n -F -B2 -A2 \"$term\" \"$logfile\" 2>/dev/null || true\n echo \"\"\n } >> \"$RESULT_FILE\"\n\n # Build JSON entry for this file match\n ESCAPED_PATH=$(echo \"$RELATIVE_PATH\" | jq -Rs '.')\n ESCAPED_TERM=$(echo \"$term\" | jq -Rs '.')\n RUN_MATCHES=\"$RUN_MATCHES{\\\"file\\\":$ESCAPED_PATH,\\\"term\\\":$ESCAPED_TERM,\\\"count\\\":$MATCH_COUNT},\"\n fi\n 
done\n done < <(find \"$RUN_DIR\" -name \"*.txt\" -print0)\n\n if [ \"$RUN_MATCH_COUNT\" -gt 0 ]; then\n TOTAL_MATCHES=$((TOTAL_MATCHES + RUN_MATCH_COUNT))\n RUN_MATCHES=\"${RUN_MATCHES%,}\"\n RUN_URL=\"https://github.com/$GITHUB_REPOSITORY/actions/runs/$RUN_ID\"\n MANIFEST_ENTRIES=\"$MANIFEST_ENTRIES{\\\"run_id\\\":$RUN_ID,\\\"url\\\":\\\"$RUN_URL\\\",\\\"match_count\\\":$RUN_MATCH_COUNT,\\\"file_matches\\\":[$RUN_MATCHES]},\"\n echo \" Run $RUN_ID: $RUN_MATCH_COUNT match(es)\"\n else\n echo \" Run $RUN_ID: no matches\"\n # Clean up log files for runs with no matches to save disk\n rm -rf \"$RUN_DIR\"\n fi\ndone\n\n# Build search terms JSON array\nTERMS_JSON=\"[\"\nfor term in \"${CLEAN_TERMS[@]}\"; do\n ESCAPED=$(echo \"$term\" | jq -Rs '.')\n TERMS_JSON=\"$TERMS_JSON$ESCAPED,\"\ndone\nTERMS_JSON=\"${TERMS_JSON%,}]\"\n\n# Write the search manifest\nMANIFEST_ENTRIES=\"${MANIFEST_ENTRIES%,}\"\ncat > \"$RESULTS_DIR/search-manifest.json\" <<MANIFEST_EOF\n{\n \"workflow\": \"$SEARCH_WORKFLOW\",\n \"search_terms\": $TERMS_JSON,\n \"since\": \"$SINCE_DATE\",\n \"conclusion_filter\": \"$SEARCH_CONCLUSION\",\n \"runs_searched\": $RUNS_SEARCHED,\n \"runs_failed\": $RUNS_FAILED,\n \"total_matches\": $TOTAL_MATCHES,\n \"results\": [$MANIFEST_ENTRIES]\n}\nMANIFEST_EOF\n\necho \"\"\necho \"Search complete. Total matches: $TOTAL_MATCHES across $COLLECTED run(s).\"\necho \"Results written to $RESULTS_DIR/\"\n" |
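The counting fix in the suggestion above can be demonstrated in isolation. This sketch stubs out the download/extract step (the run IDs and the failing set are made up) and shows that a run is only counted as searched once its logs are actually available:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stub: pretend runs 102 and 104 fail to download or extract.
fetch_and_extract() {
  case "$1" in
    102|104) return 1 ;;  # simulated download/extract failure
    *) return 0 ;;
  esac
}

RUNS_SEARCHED=0
RUNS_FAILED=0
for RUN_ID in 101 102 103 104 105; do
  if ! fetch_and_extract "$RUN_ID"; then
    RUNS_FAILED=$((RUNS_FAILED + 1))
    continue
  fi
  # Increment only after logs are confirmed present, mirroring the fix.
  RUNS_SEARCHED=$((RUNS_SEARCHED + 1))
done

echo "searched=$RUNS_SEARCHED failed=$RUNS_FAILED"
```

With this split, `runs_searched` in the manifest reflects runs whose logs were actually inspected, while `runs_failed` surfaces gaps the agent should treat as potential false negatives.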
Summary
Adds a new reusable log-searching agent workflow that performs an inline preflight search over GitHub Actions logs, then hands structured results to the agent for investigation and issue creation when findings are actionable.
Changes in this PR
- `.github/workflows/gh-aw-log-searching-agent.md` — `workflow_call` inputs for workflow targeting, exact-match search terms, lookback window, run limits, conclusion filtering, and optional setup/instruction controls; writes `search-manifest.json` under `/tmp/gh-aw/search-results`.
- `gh-agent-workflows/log-searching-agent/example.yml` — example caller with a `workflow_dispatch` trigger and required inputs.
- `gh-agent-workflows/log-searching-agent/README.md` — documents behavior, inputs, and safe-output behavior.
- `scripts/dogfood.sh` — updated to exclude `log-searching-agent`.
- `.github/workflows/gh-aw-log-searching-agent.lock.yml` — recompiled to pick up the `messages-footer` format change from `main`.

The body of this PR is automatically managed by the Trigger Update PR Body workflow.