From 6f635530a311a8c1e3e4e97afa36b3d45d6c13f3 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Thu, 23 Apr 2026 15:07:39 +0200
Subject: [PATCH 01/25] feat: add konflux-dep-bumps skill

Skill for triaging and fixing failing Konflux/MintMaker dependency bump PRs. Covers the full workflow: getting the current open PR list, cheap fixes first (rebase/retest), root cause investigation, fix strategy selection, local verification, and opening fix PRs.

Built from real triage experience across the RedHatInsights Go and Python repos, including patterns for shared library breakages, archived packages, Python dependency conflicts, and processing-tools hook regressions.

Co-Authored-By: Claude Sonnet 4.6 (1M context)
---
 skills/konflux-dep-bumps/SKILL.md | 171 ++++++++++++++++++++++++++++++
 1 file changed, 171 insertions(+)
 create mode 100644 skills/konflux-dep-bumps/SKILL.md

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
new file mode 100644
index 0000000..c46a81c
--- /dev/null
+++ b/skills/konflux-dep-bumps/SKILL.md
+---
+name: konflux-dep-bumps
+description: Triage and fix failing Konflux/MintMaker dependency bump PRs (bot-authored, auto-merge enabled). Use when a Renovate/MintMaker PR is stuck due to CI failures. Covers Go and Python repos in the RedHatInsights org.
+---
+
+# Konflux Dependency Bump Triage Skill
+
+MintMaker (Renovate via Konflux) opens dependency bump PRs with auto-merge enabled. When CI fails the PR stalls. This skill covers triage, investigation, and resolution.
+
+---
+
+## Step 0 — Get the current open Konflux bot PRs
+
+**Source of truth:** https://github.com/RedHatInsights/processing-tools/tree/master/open_mr_pr/github
+
+Check the `open-prs-konflux.md` file there. The date at the top of that file tells you when it was last generated. **If the date does not match today's date, the file is stale — run the fetcher locally and inform the user before proceeding:**
+
+```bash
+cd open_mr_pr/github # from the root of your processing-tools clone
+python3 list_repos_prs.py
+cat open-prs-konflux.md
+```
+
+---
+
+## Step 1 — Check what is failing
+
+For each stuck PR:
+
+```bash
+gh pr checks --repo RedHatInsights/
+```
+
+Note every failing check. The most common are: Go tests, lint, BDD tests, Konflux pipeline, enterprise contract, artifact update (`renovate/artifacts`). Multiple failures often share a single root cause — fix the root and the rest clear.
+
+---
+
+## Step 2 — Try the cheap fixes first
+
+Before investigating root cause, try these in order. They resolve a large portion of stuck PRs with no code change.
+
+**Rebase** — covers stale go.sum, go.mod drift, or Renovate artifact failures:
+
+```bash
+gh pr comment --repo RedHatInsights/ --body "/rebase"
+```
+
+Or tick the rebase checkbox in the PR body — Renovate watches for it and re-runs with fresh artifacts.
+
+**Retest** — covers flaky or environment-dependent failures (infrastructure errors clearly unrelated to the dependency change, e.g. Kafka unreachable, OCM API client errors, DB dial failures). Check whether the Konflux pipeline itself passed even if GitHub Actions failed — that is a strong signal the failure is environmental:
+
+```bash
+gh pr comment --repo RedHatInsights/ --body "/retest"
+```
+
+Note: use `/retest`, not `/ok-to-test` — the latter may not be wired up in all repos.
+
+If either of these resolves it, move on. If not, proceed to root cause investigation.
+
+---
+
+## Step 3 — Read the logs and identify the root cause
+
+Pull the failed run logs:
+
+```bash
+gh run view --repo RedHatInsights/ --log-failed 2>&1 | grep -E "undefined|cannot use|no field|incompatible|conflict|Error|FAILED" | head -40
+```
+
+**Understand what broke before deciding how to fix it.** The PR description lists every bumped package with links to release notes — read them for the relevant version. Look specifically for breaking changes: removed fields, renamed types, changed function signatures, altered dependency requirements.
+
+Key questions to answer:
+- Which bumped package introduced the breakage?
+- Is it the bumped package itself that broke, or something that depends on it?
+- Is the broken package archived / unmaintained?
+- Does a newer compatible version of the affected package exist already?
+- Is the breakage in this repo's own code, or in a shared library that this repo depends on?
+
+**The last question matters most for scoping the fix.** If the breakage originates in a shared library, fixing it there unblocks all downstream repos at once. Always check whether a direct dependency of the repo is pulling in the broken package transitively (`go mod graph`, `go.mod` indirect entries) before deciding where to fix.
+
+---
+
+## Step 4 — Choose the right fix
+
+In order of preference:
+
+1. **Bump the affected package** to a version that is compatible with the newly bumped dependency — the cleanest outcome, keeps everything current.
+
+2. **Replace an archived or unmaintained package** with its maintained equivalent — required when the affected package will never release a fix. Verify the replacement on the Go module proxy or PyPI before committing to it. Check that the package is actually published and not just present in a GitHub repo.
+
+3. **Pin the breaking dependency back** to the last working version — last resort only, and only temporarily. Renovate will keep reopening the bump PR, so this is a holding pattern until option 1 or 2 becomes available. Document clearly in the PR why it is pinned.
+
+**Do not** create new source files or replace packages solely because a native alternative exists in another library. The reason to replace must be concrete: the package is archived, unmaintained, or structurally incompatible with no fix path.
+
+**Do not** apply the fix directly to repos that get the broken package transitively. Fix it at the source (the shared library), verify the downstream effect with a local `replace` directive, then open one PR instead of many.
+
+---
+
+## Step 5 — Verify the fix before opening a PR
+
+**Always verify locally before pushing — no exceptions.**
+
+For a fix in a shared library, verify the downstream effect by pointing a dependent repo at the local fix using a `replace` directive:
+
+```bash
+# In the downstream repo's go.mod, temporarily add:
+replace github.com/RedHatInsights/ => /path/to/local/fix
+
+go mod tidy
+go build ./...
+go test ./...
+```
+
+If the broken package disappears from `go.mod` and all tests pass, the fix is correct.
+
+For any fix repo, always run all available tests locally:
+
+```bash
+# Go
+go build ./... && go test ./...
+# Also check Makefile for additional targets (BDD, integration, e2e)
+grep -E "^test|^bdd|^e2e|^integration" Makefile
+
+# Python
+pip install -r requirements.txt && python -m pytest
+```
+
+**Do not push without local tests passing.**
+
+---
+
+## Step 6 — Open the fix PR
+
+Fork the repo, create a branch from the upstream default branch, apply the fix, run tests, then open a PR.
+
+The PR description must include:
+- What broke and why (cite the specific breaking change from release notes with a link)
+- Why this fix is the right approach (not just what changed)
+- A link to the Konflux bot PR it unblocks
+- If fixing a shared library: note which downstream repos are affected
+
+After opening, comment on the stuck Konflux bot PR linking to the fix and noting it can be retested once the fix merges.
+
+---
+
+## Step 7 — After the fix merges
+
+Renovate will rebase the bot PR automatically once the fix lands. If it does not rebase within a few hours, trigger it manually via the rebase checkbox or comment. For fixes in shared libraries, downstream bot PRs will not self-heal until Renovate opens a new bump that includes the updated shared library version — closing and recreating the bot PRs may be needed.
+
+---
+
+## Triage standards
+
+- **One root cause can affect many repos.** Always check the full list of open Konflux PRs before starting — a pattern across repos points to a shared dependency or a shared library as the source.
+- **Check the package's maintenance status** (archived? last commit date? open issues?) before deciding on a fix strategy. An archived package needs replacement; a recently released package may just need a version bump.
+- **Check release dates.** If a breaking change was released very recently, downstream packages may not have had time to react yet. Document this and park the PR rather than applying a workaround.
+- **The renovate.json in these repos is centrally managed** (synced from `processing-tools`) — do not edit it in downstream repos to work around dependency conflicts. Fix the conflict properly.
+- **If a failure involves a processing-tools version bump** (pre-commit hooks, shared workflows, shared scripts), always ask the user before fixing it in the downstream repo. The right fix may be upstream in `processing-tools` itself — which is our repo and where the change should live. Fixing it downstream is a workaround; fixing it upstream unblocks all repos at once.
+- **Never add files to repos unnecessarily.** The fix should be the minimum change that resolves the incompatibility — a go.mod update, an import swap, a version bump. Not a new source file unless genuinely required.
+
+---
+
+## Common failure patterns
+
+| Symptom | Likely cause | Where to look |
+|---------|-------------|---------------|
+| Build fails with `undefined`, `no field`, `cannot use` on a transitive dep | Breaking API change in a bumped package used by a shared library | Check which shared library pulls in the broken package; fix there |
+| Python `ResolutionImpossible` | Two bumped packages require incompatible ranges of a shared transitive dep | Read both packages' dependency specs; align the versions |
+| `go: updates to go.mod needed` | Renovate artifact update failed; go.sum is stale | Rebase first; if that fails, run `go mod tidy` manually |
+| All checks fail on a Go PR, Konflux pipeline also fails | Usually a compile error, not flaky tests | Read build logs, not test logs |
+| GitHub Actions fail but Konflux pipeline passes | Environmental / infrastructure failure in GHA | `/retest` to retrigger |

From b19f7c21249296e92e241e67ca20f3f65c1d832b Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Thu, 23 Apr 2026 15:30:28 +0200
Subject: [PATCH 02/25] =?UTF-8?q?fix:=20update=20rebase=20instructions=20?=
 =?UTF-8?q?=E2=80=94=20checkbox=20not=20/rebase=20comment?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 skills/konflux-dep-bumps/SKILL.md | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index c46a81c..53c8882 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -16,7 +16,7 @@ MintMaker (Renovate via Konflux) opens dependency bump PRs with auto-merge enabl
 Check the `open-prs-konflux.md` file there. The date at the top of that file tells you when it was last generated. **If the date does not match today's date, the file is stale — run the fetcher locally and inform the user before proceeding:**
 
 ```bash
-cd open_mr_pr/github # from the root of your processing-tools clone
+cd /Users/lsolarov/Documents/processing-tools-gh/open_mr_pr/github
 python3 list_repos_prs.py
 cat open-prs-konflux.md
 ```
@@ -41,11 +41,16 @@ Before investigating root cause, try these in order. They resolve a large portio
 
 **Rebase** — covers stale go.sum, go.mod drift, or Renovate artifact failures:
 
+Tick the rebase checkbox in the PR body — Renovate watches for it and re-runs with fresh artifacts:
+
 ```bash
-gh pr comment --repo RedHatInsights/ --body "/rebase"
+# Get current PR body, flip the rebase checkbox, and update the PR
+BODY=$(gh api repos/RedHatInsights//pulls/ --jq '.body')
+NEWBODY=$(echo "$BODY" | sed 's/- \[ \] /- [x] /')
+gh api repos/RedHatInsights//pulls/ --method PATCH --field body="$NEWBODY"
 ```
 
-Or tick the rebase checkbox in the PR body — Renovate watches for it and re-runs with fresh artifacts.
+Note: `/rebase` as a comment does **not** work in these repos. The checkbox in the PR body is the correct trigger. If the checkbox has already been ticked and Renovate still hasn't rebased, push an empty commit to the repo's default branch to make the bot PR go behind by one commit — Renovate will then rebase it automatically on the next run.
 
 **Retest** — covers flaky or environment-dependent failures (infrastructure errors clearly unrelated to the dependency change, e.g. Kafka unreachable, OCM API client errors, DB dial failures). Check whether the Konflux pipeline itself passed even if GitHub Actions failed — that is a strong signal the failure is environmental:
 

From 6f7680ab964770180f8a793bab197da0ed6dc78d Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Fri, 24 Apr 2026 11:32:26 +0200
Subject: [PATCH 03/25] docs: add CI/workflow gotchas from triage experience

---
 skills/konflux-dep-bumps/SKILL.md | 50 ++++++++++++++++++++++++++-----
 1 file changed, 43 insertions(+), 7 deletions(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index 53c8882..6077282 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -41,16 +41,17 @@ Before investigating root cause, try these in order. They resolve a large portio
 
 **Rebase** — covers stale go.sum, go.mod drift, or Renovate artifact failures:
 
-Tick the rebase checkbox in the PR body — Renovate watches for it and re-runs with fresh artifacts:
+Push an empty commit to the bot PR branch — this re-triggers CI and prompts Renovate to rebase with fresh artifacts:
 
 ```bash
-# Get current PR body, flip the rebase checkbox, and update the PR
-BODY=$(gh api repos/RedHatInsights//pulls/ --jq '.body')
-NEWBODY=$(echo "$BODY" | sed 's/- \[ \] /- [x] /')
-gh api repos/RedHatInsights//pulls/ --method PATCH --field body="$NEWBODY"
+BRANCH=$(gh pr view --repo RedHatInsights/ --json headRefName --jq '.headRefName')
+git fetch origin $BRANCH
+git checkout $BRANCH
+git commit --allow-empty -m "chore: trigger Renovate rebase"
+git push origin $BRANCH
 ```
 
-Note: `/rebase` as a comment does **not** work in these repos. The checkbox in the PR body is the correct trigger. If the checkbox has already been ticked and Renovate still hasn't rebased, push an empty commit to the repo's default branch to make the bot PR go behind by one commit — Renovate will then rebase it automatically on the next run.
+Note: `/rebase` as a comment does **not** work in these repos.
 
 **Retest** — covers flaky or environment-dependent failures (infrastructure errors clearly unrelated to the dependency change, e.g. Kafka unreachable, OCM API client errors, DB dial failures). Check whether the Konflux pipeline itself passed even if GitHub Actions failed — that is a strong signal the failure is environmental:
 
@@ -154,6 +155,16 @@ Renovate will rebase the bot PR automatically once the fix lands. If it does not
 
 ---
 
+## CI / workflow gotchas
+
+- **`gh run rerun` does not re-fetch reusable workflows.** When a workflow uses `uses: some-repo/.github/workflows/foo.yaml@master`, the `@master` SHA is resolved once when the run is first created and baked in. `gh run rerun` replays with that same SHA — even if master has since changed. To pick up a new workflow version, push a new commit to trigger a completely fresh run.
+
+- **Multiple PRs with the same root cause — fix them all at once.** When a wave of bot PRs hits with identical failures (e.g. missing go.sum entry across 5 repos), clone each branch, run `go mod tidy`, and push in one session rather than one by one.
+
+- **After your fix PR merges, bot PRs targeting the same files will have conflicts.** Tick the rebase checkbox to get Renovate to rebase. If the conflict is in go.mod/go.sum, Renovate will re-run artifact updates as part of the rebase.
+
+- **Coverage drop from removing covered code is not a regression to fix with arbitrary tests.** If you delete duplicated or misplaced code that happened to be well-covered, overall coverage may dip. The right response is to explain why to the team — not to add pointless tests just to hit a number. If coverage is enforced as a CI gate and blocking merges, consider making it non-blocking (`continue-on-error: true`) rather than gaming the percentage.
+
 ## Triage standards
 
 - **One root cause can affect many repos.** Always check the full list of open Konflux PRs before starting — a pattern across repos points to a shared dependency or a shared library as the source.
@@ -171,6 +182,31 @@ Renovate will rebase the bot PR automatically once the fix lands. If it does not
 |---------|-------------|---------------|
 | Build fails with `undefined`, `no field`, `cannot use` on a transitive dep | Breaking API change in a bumped package used by a shared library | Check which shared library pulls in the broken package; fix there |
 | Python `ResolutionImpossible` | Two bumped packages require incompatible ranges of a shared transitive dep | Read both packages' dependency specs; align the versions |
-| `go: updates to go.mod needed` | Renovate artifact update failed; go.sum is stale | Rebase first; if that fails, run `go mod tidy` manually |
+| `go: updates to go.mod needed` | Renovate artifact update failed; go.sum is stale — often caused by a bogus module path in go.mod | See below |
 | All checks fail on a Go PR, Konflux pipeline also fails | Usually a compile error, not flaky tests | Read build logs, not test logs |
 | GitHub Actions fail but Konflux pipeline passes | Environmental / infrastructure failure in GHA | `/retest` to retrigger |
+
+### Fixing a stale or broken go.mod on a bot PR branch
+
+When `go mod tidy` is needed and the rebase checkbox hasn't helped, fix it directly on the bot branch:
+
+```bash
+BRANCH=$(gh pr view --repo RedHatInsights/ --json headRefName --jq '.headRefName')
+git clone git@github.com:RedHatInsights/.git -fix
+cd -fix
+git fetch origin $BRANCH && git checkout $BRANCH
+```
+
+Check for obviously wrong entries — Renovate occasionally introduces bogus module paths (e.g. changing `github.com/foo/bar v1+incompatible` to `github.com/foo/bar/v3` when the real v3 is at a completely different path). If something looks wrong, remove it:
+
+```bash
+sed -i '' '/bogus-module-path/d' go.mod
+go mod tidy
+go build ./...
+go test ./...
+git add go.mod go.sum
+git commit -m "chore: fix go.mod and run go mod tidy"
+git push origin $BRANCH
+```
+
+`go mod tidy` will restore any indirect dependency that is genuinely still needed (at the correct module path), and drop anything that isn't. Trust it.

From a69384aca1d78538d97d84b5e0ec650f71279629 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Tue, 5 May 2026 15:20:49 +0200
Subject: [PATCH 04/25] docs: document bonfire-tekton failure handling in konflux-dep-bumps skill
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds guidance for bonfire-tekton `deploy-application` failures that cannot be triaged programmatically — explains how to get task-level pass/fail via the check-runs API, why step logs require the Konflux UI, and adds the pattern to the known-failure quick-reference table.

Co-Authored-By: Claude Sonnet 4.6 (1M context)
---
 skills/konflux-dep-bumps/SKILL.md | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index 6077282..27715a1 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -63,16 +63,37 @@ Note: use `/retest`, not `/ok-to-test` — the latter may not be wired up in all
 
 If either of these resolves it, move on. If not, proceed to root cause investigation.
 
+**Exception — bonfire-tekton `deploy-application` failures:** these cannot be triaged with cheap fixes and cannot be triaged at all without the logs. Ask the user to open the Konflux UI link from the task table, navigate to the `deploy-application` step, and paste the relevant log snippet. The causes are varied — a missing image tag can result from a failed Konflux build pipeline, a component pointing at the wrong or outdated Quay repo, changes in multiple repos landing simultaneously, or a combination. Do not attempt to guess the root cause without the log.
+
 ---
 
 ## Step 3 — Read the logs and identify the root cause
 
-Pull the failed run logs:
+**GitHub Actions failures** — use `gh run view`:
 
 ```bash
 gh run view --repo RedHatInsights/ --log-failed 2>&1 | grep -E "undefined|cannot use|no field|incompatible|conflict|Error|FAILED" | head -40
 ```
 
+**Bonfire-tekton failures** — these are Tekton pipeline runs in Konflux, not GitHub Actions. `gh run view` does not work for them. The task-level pass/fail table is available via the GitHub check-runs API:
+
+```bash
+SHA=$(gh pr view --repo RedHatInsights/ --json headRefOid --jq '.headRefOid')
+gh api "repos/RedHatInsights//commits/$SHA/check-runs" | python3 -c "
+import json, sys, re
+data = json.load(sys.stdin)
+for r in data.get('check_runs', []):
+    if 'bonfire' in r['name'] and r['conclusion'] == 'failure':
+        print(re.sub('<[^>]+>', ' ', r.get('output', {}).get('text', '')))
+"
+```
+
+This shows which Tekton steps failed (e.g. `reserve-namespace`, `deploy-application`, `teardown`) but **not the actual log output**. The step logs are only accessible in the Konflux UI (browser + RH SSO) while the pipeline run is still alive — once it completes, the pods are gone and logs are no longer reachable programmatically.
+
+**If you need to know the actual error message from a failed bonfire-tekton step, ask the user to either:**
+- Open the Konflux UI link from the task table, navigate to the failed step, and paste the relevant log snippet
+- Or download the log file from the Konflux UI and share it
+
 **Understand what broke before deciding how to fix it.** The PR description lists every bumped package with links to release notes — read them for the relevant version. Look specifically for breaking changes: removed fields, renamed types, changed function signatures, altered dependency requirements.
 
 Key questions to answer:
@@ -185,6 +206,7 @@ Renovate will rebase the bot PR automatically once the fix lands. If it does not
 | `go: updates to go.mod needed` | Renovate artifact update failed; go.sum is stale — often caused by a bogus module path in go.mod | See below |
 | All checks fail on a Go PR, Konflux pipeline also fails | Usually a compile error, not flaky tests | Read build logs, not test logs |
 | GitHub Actions fail but Konflux pipeline passes | Environmental / infrastructure failure in GHA | `/retest` to retrigger |
+| Bonfire-tekton `deploy-application` fails | Cannot triage without the log — causes range from missing image tags (build pipeline failed, wrong Quay repo) to multi-repo interaction issues | Ask the user to paste the `deploy-application` log snippet from the Konflux UI |
 
 ### Fixing a stale or broken go.mod on a bot PR branch
 

From 356ff30b71f741c07df28bf4771d5fa3cab18163 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Wed, 6 May 2026 08:23:53 +0200
Subject: [PATCH 05/25] fix: add Go bin to PATH before running make test

---
 .github/workflows/gotests.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.github/workflows/gotests.yaml b/.github/workflows/gotests.yaml
index 8286439..5f6fe39 100644
--- a/.github/workflows/gotests.yaml
+++ b/.github/workflows/gotests.yaml
@@ -39,6 +39,8 @@ jobs:
     name: Go tests
     steps:
       - uses: actions/checkout@v6
+      - name: Add Go bin to PATH
+        run: echo "$(go env GOPATH)/bin" >> $GITHUB_PATH
       - name: Unit tests
         run: make test
 

From a33372dc74295c84d1eaf8c5d1aa191cf48f0783 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Wed, 6 May 2026 08:27:55 +0200
Subject: [PATCH 06/25] Revert "fix: add Go bin to PATH before running make test"

This reverts commit 356ff30b71f741c07df28bf4771d5fa3cab18163.
---
 .github/workflows/gotests.yaml | 2 --
 1 file changed, 2 deletions(-)

diff --git a/.github/workflows/gotests.yaml b/.github/workflows/gotests.yaml
index 5f6fe39..8286439 100644
--- a/.github/workflows/gotests.yaml
+++ b/.github/workflows/gotests.yaml
@@ -39,8 +39,6 @@ jobs:
     name: Go tests
     steps:
       - uses: actions/checkout@v6
-      - name: Add Go bin to PATH
-        run: echo "$(go env GOPATH)/bin" >> $GITHUB_PATH
       - name: Unit tests
         run: make test
 

From f4879824a12e28902916dd2d82f990c95de67699 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Wed, 6 May 2026 08:35:40 +0200
Subject: [PATCH 07/25] chore: document linter patterns and fix workflow learnings in skill

---
 skills/konflux-dep-bumps/SKILL.md | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index 27715a1..b9f8c9f 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -232,3 +232,22 @@ git push origin $BRANCH
 ```
 
 `go mod tidy` will restore any indirect dependency that is genuinely still needed (at the correct module path), and drop anything that isn't. Trust it.
+
+---
+
+## Similar failures across a similar update
+
+When multiple repos fail on the same check after a similar bot PR (e.g. all failing after a shared tooling version bump), treat it as one investigation, not many. Pull the logs for a couple of repos and compare — if the root cause is the same, one fix approach applies to all.
+
+**Ask the user how to fix before acting.** A shared tooling bump is a processing-tools concern — confirm the approach with the user first. The correct fix is to address the underlying issue rather than suppress it in config.
+
+**When making the fix across repos:**
+- Clone fresh to `/tmp` rather than fighting a local clone that may have a stale lock or uncommitted changes.
+- Branch from upstream master/main, not from the bot PR branch.
+- Run `go build ./...` and `go test ./...` locally before pushing — not just the build.
+- When using `replace_all` to swap a string literal for a constant, the replacement will also hit the constant's own definition. Always verify the const declaration still has the string literal, not a self-reference.
+- After code changes, run the linter locally before pushing — formatters can reformat code after a substitution and cause the pre-commit hook to report "files were modified".
+
+**Opening fix PRs:** one per repo, linked to the bot PR it unblocks, with a description explaining what changed and why the fix is correct.
+
+**Pre-existing CI failures.** Verify any failing check was already failing on the bot PR before assuming your change broke it. If it was — note it in your PR description and move on.

From e25287e7e52f84073786fbb9f198a578d3f448d5 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Wed, 6 May 2026 09:32:55 +0200
Subject: [PATCH 08/25] chore: trigger rebase

From 94150c05b5aaaeb03734d38b6580a7c47cb4ea65 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Wed, 6 May 2026 09:32:57 +0200
Subject: [PATCH 09/25] chore: trigger rebase

From f3eb0d671cbcf56a958e1105c94f964f5753d82b Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Wed, 6 May 2026 09:32:59 +0200
Subject: [PATCH 10/25] chore: trigger rebase

From c44af9cfc0264e2c92da085aaedff2b7f1c3032c Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Wed, 6 May 2026 09:51:42 +0200
Subject: [PATCH 11/25] chore: document linter version mismatch pitfall in skill

---
 skills/konflux-dep-bumps/SKILL.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index b9f8c9f..ad9eea2 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -244,6 +244,7 @@ When multiple repos fail on the same check after a similar bot PR (e.g. all fail
 **When making the fix across repos:**
 - Clone fresh to `/tmp` rather than fighting a local clone that may have a stale lock or uncommitted changes.
 - Branch from upstream master/main, not from the bot PR branch.
+- **Run the linter with the bumped version, not the current one on master.** Your fix PR runs lint against master's tooling version — if the bot PR bumps golangci-lint, your fix may pass locally and in CI while the bot PR still fails because the newer version finds additional violations. Check the bot PR's `.pre-commit-config.yaml` for the bumped version and run it locally against the full repo before opening the fix PR.
 - Run `go build ./...` and `go test ./...` locally before pushing — not just the build.
 - When using `replace_all` to swap a string literal for a constant, the replacement will also hit the constant's own definition. Always verify the const declaration still has the string literal, not a self-reference.
 - After code changes, run the linter locally before pushing — formatters can reformat code after a substitution and cause the pre-commit hook to report "files were modified".

From 79b0c37f752d1bdba52baa03fda44cdc2c9e265f Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Wed, 6 May 2026 10:11:28 +0200
Subject: [PATCH 12/25] chore: clarify linter version check must happen before each commit

---
 skills/konflux-dep-bumps/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index ad9eea2..a232d7d 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -244,7 +244,7 @@ When multiple repos fail on the same check after a similar bot PR (e.g. all fail
 **When making the fix across repos:**
 - Clone fresh to `/tmp` rather than fighting a local clone that may have a stale lock or uncommitted changes.
 - Branch from upstream master/main, not from the bot PR branch.
-- **Run the linter with the bumped version, not the current one on master.** Your fix PR runs lint against master's tooling version — if the bot PR bumps golangci-lint, your fix may pass locally and in CI while the bot PR still fails because the newer version finds additional violations. Check the bot PR's `.pre-commit-config.yaml` for the bumped version and run it locally against the full repo before opening the fix PR.
+- **Run the linter with the bumped version, not the current one on master — before every commit.** Your fix PR runs lint against master's tooling version, so passing there means nothing for the bot PR. Check the bot PR's `.pre-commit-config.yaml` for the bumped version, install it, and run it against the full repo after each change. Only commit when that version reports no new violations. This avoids the back-and-forth where each fix exposes the next wave of truncated linter output.
 - Run `go build ./...` and `go test ./...` locally before pushing — not just the build.
 - When using `replace_all` to swap a string literal for a constant, the replacement will also hit the constant's own definition. Always verify the const declaration still has the string literal, not a self-reference.
 - After code changes, run the linter locally before pushing — formatters can reformat code after a substitution and cause the pre-commit hook to report "files were modified".

From 10c724b4c94adf470a867b7808de1573cb9563bd Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Wed, 6 May 2026 17:08:49 +0200
Subject: [PATCH 13/25] chore: document go mod tidy as fix for broken go.sum on bot branches

---
 skills/konflux-dep-bumps/SKILL.md | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index a232d7d..933e4b2 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -39,7 +39,7 @@ Note every failing check. The most common are: Go tests, lint, BDD tests, Konflu
 
 Before investigating root cause, try these in order. They resolve a large portion of stuck PRs with no code change.
 
-**Rebase** — covers stale go.sum, go.mod drift, or Renovate artifact failures:
+**Rebase** — covers go.mod drift or Renovate artifact failures where Renovate hasn't yet attempted to fix it:
 
 Push an empty commit to the bot PR branch — this re-triggers CI and prompts Renovate to rebase with fresh artifacts:
 
@@ -53,6 +53,20 @@ git push origin $BRANCH
 
 Note: `/rebase` as a comment does **not** work in these repos.
 
+**`go mod tidy` directly** — use this instead of an empty commit when Renovate already ran artifact updates but produced a broken go.sum (e.g. missing checksum entries for a major version bump). An empty commit will just re-run the same broken artifact update. Clone the bot branch, run `go mod tidy`, verify `go build ./...` passes, then push go.mod and go.sum directly to the bot branch:
+
+```bash
+BRANCH=$(gh pr view --repo RedHatInsights/ --json headRefName --jq '.headRefName')
+git clone git@github.com:RedHatInsights/.git /tmp/-fix
+cd /tmp/-fix
+git fetch origin $BRANCH && git checkout FETCH_HEAD -b fix-go-sum
+go mod tidy
+go build ./...
+git add go.mod go.sum
+git commit -m "chore: run go mod tidy to fix missing go.sum entries"
+git push origin fix-go-sum:$BRANCH
+```
+
 **Retest** — covers flaky or environment-dependent failures (infrastructure errors clearly unrelated to the dependency change, e.g. Kafka unreachable, OCM API client errors, DB dial failures). Check whether the Konflux pipeline itself passed even if GitHub Actions failed — that is a strong signal the failure is environmental:
 
 ```bash

From 114c27d2486233955948f5a6bfb55bb50ba608f9 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Wed, 6 May 2026 17:09:31 +0200
Subject: [PATCH 14/25] chore: suggest closing empty bot PRs after go mod tidy produces no diff

---
 skills/konflux-dep-bumps/SKILL.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index 933e4b2..35182ac 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -67,6 +67,8 @@ git commit -m "chore: run go mod tidy to fix missing go.sum entries"
 git push origin fix-go-sum:$BRANCH
 ```
 
+If `go mod tidy` produces no diff at all (go.mod and go.sum unchanged), the bot PR has no real effect — the dependency is already correctly resolved. In that case, let the user know and suggest they close the PR: do not close it yourself.
+
 **Retest** — covers flaky or environment-dependent failures (infrastructure errors clearly unrelated to the dependency change, e.g. Kafka unreachable, OCM API client errors, DB dial failures). Check whether the Konflux pipeline itself passed even if GitHub Actions failed — that is a strong signal the failure is environmental:
 
 ```bash

From 938078e50d834aabc7b83f8ea9cf3fef71cd23bd Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Thu, 7 May 2026 08:18:06 +0200
Subject: [PATCH 15/25] chore: clarify merge vs close for empty bot PRs after go mod tidy

---
 skills/konflux-dep-bumps/SKILL.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index 35182ac..0c95805 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -67,7 +67,9 @@ git commit -m "chore: run go mod tidy to fix missing go.sum entries"
 git push origin fix-go-sum:$BRANCH
 ```
 
-If `go mod tidy` produces no diff at all (go.mod and go.sum unchanged), the bot PR has no real effect — the dependency is already correctly resolved. In that case, let the user know and suggest they close the PR: do not close it yourself.
+If `go mod tidy` reverts all of Renovate's changes (go.mod ends up the same as master, or the diff undoes what the bot introduced), the bot PR has no real effect. Let the user know the PR is empty and present the options — do not act yourself:
+- **Let pipelines run and merge** — preferred, because merging signals to Renovate that the bump has been handled and prevents it from recreating the PR.
+- **Close** — may cause Renovate to recreate the same PR, so only do this if you're sure the bump is invalid and you want to block it.
 
 **Retest** — covers flaky or environment-dependent failures (infrastructure errors clearly unrelated to the dependency change, e.g. Kafka unreachable, OCM API client errors, DB dial failures).
Check whether the Konflux pipeline itself passed even if GitHub Actions failed — that is a strong signal the failure is environmental: From 2ef8ebd5a1b75525dffee493c0f77b9270dbbb74 Mon Sep 17 00:00:00 2001 From: Lena Solarova Date: Thu, 7 May 2026 10:29:22 +0200 Subject: [PATCH 16/25] chore: link Konflux navigation and debugging skills for pipeline log access --- skills/konflux-dep-bumps/SKILL.md | 22 +++++----------------- 1 file changed, 5 insertions(+), 17 deletions(-) diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md index 0c95805..752aa07 100644 --- a/skills/konflux-dep-bumps/SKILL.md +++ b/skills/konflux-dep-bumps/SKILL.md @@ -81,7 +81,7 @@ Note: use `/retest`, not `/ok-to-test` — the latter may not be wired up in all If either of these resolves it, move on. If not, proceed to root cause investigation. -**Exception — bonfire-tekton `deploy-application` failures:** these cannot be triaged with cheap fixes and cannot be triaged at all without the logs. Ask the user to open the Konflux UI link from the task table, navigate to the `deploy-application` step, and paste the relevant log snippet. The causes are varied — a missing image tag can result from a failed Konflux build pipeline, a component pointing at the wrong or outdated Quay repo, changes in multiple repos landing simultaneously, or a combination. Do not attempt to guess the root cause without the log. +**Exception — bonfire-tekton `deploy-application` failures:** these cannot be triaged with cheap fixes and cannot be triaged without the logs. Use the Konflux navigation and debugging skills (linked in Step 3) to extract the PipelineRun URL and read the logs directly before asking the user for anything. The causes are varied — a missing image tag can result from a failed Konflux build pipeline, a component pointing at the wrong or outdated Quay repo, or changes in multiple repos landing simultaneously. --- @@ -93,24 +93,12 @@ If either of these resolves it, move on. 
If not, proceed to root cause investiga gh run view --repo RedHatInsights/ --log-failed 2>&1 | grep -E "undefined|cannot use|no field|incompatible|conflict|Error|FAILED" | head -40 ``` -**Bonfire-tekton failures** — these are Tekton pipeline runs in Konflux, not GitHub Actions. `gh run view` does not work for them. The task-level pass/fail table is available via the GitHub check-runs API: +**Konflux pipeline failures (bonfire-tekton, on-pull-request, or any check containing "konflux")** — for any Konflux failure you must always attempt to read the actual logs before asking the user. Use the two external skills linked below in sequence: -```bash -SHA=$(gh pr view --repo RedHatInsights/ --json headRefOid --jq '.headRefOid') -gh api "repos/RedHatInsights//commits/$SHA/check-runs" | python3 -c " -import json, sys, re -data = json.load(sys.stdin) -for r in data.get('check_runs', []): - if 'bonfire' in r['name'] and r['conclusion'] == 'failure': - print(re.sub('<[^>]+>', ' ', r.get('output', {}).get('text', ''))) -" -``` - -This shows which Tekton steps failed (e.g. `reserve-namespace`, `deploy-application`, `teardown`) but **not the actual log output**. The step logs are only accessible in the Konflux UI (browser + RH SSO) while the pipeline run is still alive — once it completes, the pods are gone and logs are no longer reachable programmatically. +1. **[navigating-github-to-konflux-pipelines](https://github.com/konflux-ci/skills/blob/main/skills/navigating-github-to-konflux-pipelines/SKILL.md)** — extracts the PipelineRun URL from the GitHub check run (via `gh api` check-runs, filtering for "konflux" in the check name), then parses the URL to get cluster, namespace, and pipelinerun name. +2. **[debugging-pipeline-failures](https://github.com/konflux-ci/skills/blob/main/skills/debugging-pipeline-failures/SKILL.md)** — uses `kubectl`/`oc` with the extracted cluster/namespace/pipelinerun to read logs, inspect TaskRun status, and identify the root cause. 
-**If you need to know the actual error message from a failed bonfire-tekton step, ask the user to either:**
-- Open the Konflux UI link from the task table, navigate to the failed step, and paste the relevant log snippet
-- Or download the log file from the Konflux UI and share it
+Do not ask the user for logs or suggest they open a browser until you have attempted this yourself. Only fall back to asking the user if the cluster token is expired or the pipeline run has already completed and pods are gone.
 
 **Understand what broke before deciding how to fix it.** The PR description lists every bumped package with links to release notes — read them for the relevant version. Look specifically for breaking changes: removed fields, renamed types, changed function signatures, altered dependency requirements.
 

From fbeb798372c37e300b33667123e4ca6ca436fcfd Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Thu, 7 May 2026 15:00:49 +0200
Subject: [PATCH 17/25] chore: try rebase before go mod tidy for broken go.sum

---
 skills/konflux-dep-bumps/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index 752aa07..cceeafc 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -53,7 +53,7 @@ git push origin $BRANCH
 
 Note: `/rebase` as a comment does **not** work in these repos.
 
-**`go mod tidy` directly** — use this instead of an empty commit when Renovate already ran artifact updates but produced a broken go.sum (e.g. missing checksum entries for a major version bump). An empty commit will just re-run the same broken artifact update. Clone the bot branch, run `go mod tidy`, verify `go build ./...` passes, then push go.mod and go.sum directly to the bot branch:
+**`go mod tidy` directly** — use this when Renovate already ran artifact updates but produced a broken go.sum (e.g. missing checksum entries for a major version bump). **Try the empty commit rebase first** — it may be enough if Renovate's artifact update just didn't run properly. Only move to `go mod tidy` if the rebase attempt doesn't fix it. Clone the bot branch, run `go mod tidy`, verify `go build ./...` passes, then push go.mod and go.sum directly to the bot branch:
 
 ```bash
 BRANCH=$(gh pr view --repo RedHatInsights/ --json headRefName --jq '.headRefName')

From e89d912e58c1a787fe185651e78458208c327c89 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Thu, 7 May 2026 15:02:26 +0200
Subject: [PATCH 18/25] chore: clarify oc logs only works while pipeline is running

---
 skills/konflux-dep-bumps/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index cceeafc..6102f75 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -98,7 +98,7 @@ gh run view --repo RedHatInsights/ --log-failed 2>&1 | grep -E "u
 1. **[navigating-github-to-konflux-pipelines](https://github.com/konflux-ci/skills/blob/main/skills/navigating-github-to-konflux-pipelines/SKILL.md)** — extracts the PipelineRun URL from the GitHub check run (via `gh api` check-runs, filtering for "konflux" in the check name), then parses the URL to get cluster, namespace, and pipelinerun name.
 2. **[debugging-pipeline-failures](https://github.com/konflux-ci/skills/blob/main/skills/debugging-pipeline-failures/SKILL.md)** — uses `kubectl`/`oc` with the extracted cluster/namespace/pipelinerun to read logs, inspect TaskRun status, and identify the root cause.
 
-Do not ask the user for logs or suggest they open a browser until you have attempted this yourself. Only fall back to asking the user if the cluster token is expired or the pipeline run has already completed and pods are gone.
+Attempt this yourself first. Note that `oc logs` only works while the pipeline is still running — once it completes the pods are gone. If the run has already finished, fall back to asking the user to paste the relevant log snippet from the Konflux UI.
 
 **Understand what broke before deciding how to fix it.** The PR description lists every bumped package with links to release notes — read them for the relevant version. Look specifically for breaking changes: removed fields, renamed types, changed function signatures, altered dependency requirements.
 

From dbd137654527d65b95ad9166bd9b12ebc5b3452b Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Thu, 7 May 2026 15:04:09 +0200
Subject: [PATCH 19/25] chore: correct log availability - Konflux retains pods after completion

---
 skills/konflux-dep-bumps/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index 6102f75..9a850a2 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -98,7 +98,7 @@ gh run view --repo RedHatInsights/ --log-failed 2>&1 | grep -E "u
 1. **[navigating-github-to-konflux-pipelines](https://github.com/konflux-ci/skills/blob/main/skills/navigating-github-to-konflux-pipelines/SKILL.md)** — extracts the PipelineRun URL from the GitHub check run (via `gh api` check-runs, filtering for "konflux" in the check name), then parses the URL to get cluster, namespace, and pipelinerun name.
 2. **[debugging-pipeline-failures](https://github.com/konflux-ci/skills/blob/main/skills/debugging-pipeline-failures/SKILL.md)** — uses `kubectl`/`oc` with the extracted cluster/namespace/pipelinerun to read logs, inspect TaskRun status, and identify the root cause.
 
-Attempt this yourself first. Note that `oc logs` only works while the pipeline is still running — once it completes the pods are gone. If the run has already finished, fall back to asking the user to paste the relevant log snippet from the Konflux UI.
+Always attempt this yourself first. In Konflux, pipeline run pods are retained after completion so logs remain accessible via `kubectl`/`oc` even on finished runs. Only fall back to asking the user to paste a log snippet from the Konflux UI if the cluster token is expired or the pods have been pruned.
 
 **Understand what broke before deciding how to fix it.** The PR description lists every bumped package with links to release notes — read them for the relevant version. Look specifically for breaking changes: removed fields, renamed types, changed function signatures, altered dependency requirements.
 

From 7fdc4ed301ad20c4c8b788d06f41bc8c43b20954 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Thu, 7 May 2026 15:08:00 +0200
Subject: [PATCH 20/25] chore: rewrite step 4 to reflect actual practice

---
 skills/konflux-dep-bumps/SKILL.md | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index 9a850a2..07a40a6 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -115,17 +115,13 @@ Key questions to answer:
 
 ## Step 4 — Choose the right fix
 
-In order of preference:
+The most common fixes are code changes in the repo itself (adding constants, fixing API usage, updating imports) or `go mod tidy` to fix artifacts. Beyond that:
 
-1. **Bump the affected package** to a version that is compatible with the newly bumped dependency — the cleanest outcome, keeps everything current.
+**If the breaking package is archived or unmaintained** — a replacement may be needed, but do not do this autonomously. Explain the situation to the user: what the package is, why it's unmaintained, and what the replacement candidate would be. This is a team decision. Wait for sign-off before touching anything.
 
-2. **Replace an archived or unmaintained package** with its maintained equivalent — required when the affected package will never release a fix. Verify the replacement on the Go module proxy or PyPI before committing to it. Check that the package is actually published and not just present in a GitHub repo.
+**If the PR is simply incompatible with no clear fix path** — do not pin the dependency back. Just leave the PR unmerged and let the user know. Pinning introduces technical debt and Renovate will keep reopening the PR anyway.
 
-3. **Pin the breaking dependency back** to the last working version — last resort only, and only temporarily. Renovate will keep reopening the bump PR, so this is a holding pattern until option 1 or 2 becomes available. Document clearly in the PR why it is pinned.
-
-**Do not** create new source files or replace packages solely because a native alternative exists in another library. The reason to replace must be concrete: the package is archived, unmaintained, or structurally incompatible with no fix path.
-
-**Do not** apply the fix directly to repos that get the broken package transitively. Fix it at the source (the shared library), verify the downstream effect with a local `replace` directive, then open one PR instead of many.
+**Do not** apply fixes directly to repos that get the broken package transitively. Fix it at the source (the shared library), verify the downstream effect with a local `replace` directive, then open one PR instead of many.
 
 ---
 

From f2b240ebda0a0171d360f3b38d5d3bd9c1bd7609 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Thu, 7 May 2026 15:09:04 +0200
Subject: [PATCH 21/25] chore: add linter to verify step

---
 skills/konflux-dep-bumps/SKILL.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index 07a40a6..ff98a2f 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -142,7 +142,7 @@ go test ./...
 
 If the broken package disappears from `go.mod` and all tests pass, the fix is correct.
 
-For any fix repo, always run all available tests locally:
+For any fix repo, always run all available tests and the linter locally:
 
 ```bash
 # Go
@@ -152,9 +152,12 @@ grep -E "^test|^bdd|^e2e|^integration" Makefile
 
 # Python
 pip install -r requirements.txt && python -m pytest
+
+# Linter — use the bumped version from the bot PR's .pre-commit-config.yaml, not the current one on master
+golangci-lint run ./...
 ```
 
-**Do not push without local tests passing.**
+**Do not push without local tests and linter passing.**
 
 ---
 

From c8ba326b81602fa90abfde365922b64bbb5b0cc3 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Thu, 7 May 2026 15:10:06 +0200
Subject: [PATCH 22/25] chore: fix step 7 - bot PRs don't auto-rebase, must be triggered manually

---
 skills/konflux-dep-bumps/SKILL.md | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index ff98a2f..d464616 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -177,7 +177,20 @@ After opening, comment on the stuck Konflux bot PR linking to the fix and noting
 
 ## Step 7 — After the fix merges
 
-Renovate will rebase the bot PR automatically once the fix lands. If it does not rebase within a few hours, trigger it manually via the rebase checkbox or comment. For fixes in shared libraries, downstream bot PRs will not self-heal until Renovate opens a new bump that includes the updated shared library version — closing and recreating the bot PRs may be needed.
+Bot PRs do not auto-rebase when a fix lands. Trigger it manually using one of:
+
+```bash
+# Preferred — updates the branch via GitHub API (merges master into the bot branch)
+gh pr update-branch --repo RedHatInsights/
+
+# Alternative — push an empty commit to the bot branch to retrigger CI
+BRANCH=$(gh pr view --repo RedHatInsights/ --json headRefName --jq '.headRefName')
+git fetch origin $BRANCH && git checkout $BRANCH
+git commit --allow-empty -m "chore: trigger rebase"
+git push origin $BRANCH
+```
+
+For fixes in shared libraries, downstream bot PRs will not pick up the fix until Renovate opens a new bump including the updated shared library version.
 
 ---
 

From 714e2933fe1e20b11b0f896f3be17f26123978ea Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Thu, 7 May 2026 15:11:17 +0200
Subject: [PATCH 23/25] chore: broaden cross-repo pattern description to include same-issue-per-repo cases

---
 skills/konflux-dep-bumps/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index d464616..ac9c684 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -206,7 +206,7 @@ For fixes in shared libraries, downstream bot PRs will not pick up the fix until
 
 ## Triage standards
 
-- **One root cause can affect many repos.** Always check the full list of open Konflux PRs before starting — a pattern across repos points to a shared dependency or a shared library as the source.
+- **One root cause can affect many repos.** Always check the full list of open Konflux PRs before starting — a pattern across repos may point to a shared dependency or shared library as the source, or the same issue appearing independently in each repo due to the bumped package (e.g. a linter version bump introducing the same violation type across all repos).
 - **Check the package's maintenance status** (archived? last commit date? open issues?) before deciding on a fix strategy. An archived package needs replacement; a recently released package may just need a version bump.
 - **Check release dates.** If a breaking change was released very recently, downstream packages may not have had time to react yet. Document this and park the PR rather than applying a workaround.
 - **The renovate.json in these repos is centrally managed** (synced from `processing-tools`) — do not edit it in downstream repos to work around dependency conflicts. Fix the conflict properly.

From e9a618a90b1d45efd562ee9505bd5c16296b5570 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Thu, 7 May 2026 15:12:37 +0200
Subject: [PATCH 24/25] chore: remove common failure patterns table and duplicate go.mod fix section

---
 skills/konflux-dep-bumps/SKILL.md | 38 -------------------------------
 1 file changed, 38 deletions(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index ac9c684..c208090 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -215,44 +215,6 @@ For fixes in shared libraries, downstream bot PRs will not pick up the fix until
 
 ---
 
-## Common failure patterns
-
-| Symptom | Likely cause | Where to look |
-|---------|-------------|---------------|
-| Build fails with `undefined`, `no field`, `cannot use` on a transitive dep | Breaking API change in a bumped package used by a shared library | Check which shared library pulls in the broken package; fix there |
-| Python `ResolutionImpossible` | Two bumped packages require incompatible ranges of a shared transitive dep | Read both packages' dependency specs; align the versions |
-| `go: updates to go.mod needed` | Renovate artifact update failed; go.sum is stale — often caused by a bogus module path in go.mod | See below |
-| All checks fail on a Go PR, Konflux pipeline also fails | Usually a compile error, not flaky tests | Read build logs, not test logs |
-| GitHub Actions fail but Konflux pipeline passes | Environmental / infrastructure failure in GHA | `/retest` to retrigger |
-| Bonfire-tekton `deploy-application` fails | Cannot triage without the log — causes range from missing image tags (build pipeline failed, wrong Quay repo) to multi-repo interaction issues | Ask the user to paste the `deploy-application` log snippet from the Konflux UI |
-
-### Fixing a stale or broken go.mod on a bot PR branch
-
-When `go mod tidy` is needed and the rebase checkbox hasn't helped, fix it directly on the bot branch:
-
-```bash
-BRANCH=$(gh pr view --repo RedHatInsights/ --json headRefName --jq '.headRefName')
-git clone git@github.com:RedHatInsights/.git -fix
-cd -fix
-git fetch origin $BRANCH && git checkout $BRANCH
-```
-
-Check for obviously wrong entries — Renovate occasionally introduces bogus module paths (e.g. changing `github.com/foo/bar v1+incompatible` to `github.com/foo/bar/v3` when the real v3 is at a completely different path). If something looks wrong, remove it:
-
-```bash
-sed -i '' '/bogus-module-path/d' go.mod
-go mod tidy
-go build ./...
-go test ./...
-git add go.mod go.sum
-git commit -m "chore: fix go.mod and run go mod tidy"
-git push origin $BRANCH
-```
-
-`go mod tidy` will restore any indirect dependency that is genuinely still needed (at the correct module path), and drop anything that isn't. Trust it.
-
----
-
 ## Similar failures across a similar update
 
 When multiple repos fail on the same check after a similar bot PR (e.g. all failing after a shared tooling version bump), treat it as one investigation, not many. Pull the logs for a couple of repos and compare — if the root cause is the same, one fix approach applies to all.
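
The log comparison described in "Similar failures across a similar update" can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not part of the skill itself: `shared_failures` is a hypothetical helper name, and it assumes the failing logs have already been saved locally, one file per repo. It reuses the error pattern from the Step 3 `grep`, strips a leading ISO-8601 timestamp if one is present, and counts how many logs contain each error line. Other per-line prefixes would need extra normalization.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count how many saved CI logs share each error line.
# Usage: shared_failures LOG_FILE...
# Output: "count<TAB>error line", most widely shared first.
shared_failures() {
  local log
  for log in "$@"; do
    # Keep error-looking lines (same pattern as the Step 3 grep), drop a
    # timestamp and leading whitespace, and de-duplicate within one log so
    # each repo counts at most once per error line.
    grep -E 'undefined|cannot use|no field|incompatible|conflict|Error|FAILED' "$log" \
      | sed -E 's/[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9:.]+Z[[:space:]]*//; s/^[[:space:]]+//' \
      | sort -u
  done \
    | sort | uniq -c | sort -k1,1nr \
    | awk '{ n = $1; sub(/^ *[0-9]+ /, ""); print n "\t" $0 }'
}
```

A line whose count equals the number of logs passed in is a likely shared root cause. For example, `shared_failures /tmp/logs/*.log | head` would list the most widely shared lines first, where `/tmp/logs` is an assumed location.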
From b0d0fd1188dee872afefe73d7693bf7b2e326ce4 Mon Sep 17 00:00:00 2001
From: Lena Solarova
Date: Mon, 11 May 2026 13:39:32 +0200
Subject: [PATCH 25/25] chore: fix step 6 fork wording, CI gotcha, and maintenance status triage standard

---
 skills/konflux-dep-bumps/SKILL.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/skills/konflux-dep-bumps/SKILL.md b/skills/konflux-dep-bumps/SKILL.md
index c208090..fe2e109 100644
--- a/skills/konflux-dep-bumps/SKILL.md
+++ b/skills/konflux-dep-bumps/SKILL.md
@@ -163,7 +163,7 @@ golangci-lint run ./...
 
 ## Step 6 — Open the fix PR
 
-Fork the repo, create a branch from the upstream default branch, apply the fix, run tests, then open a PR.
+Use your existing fork, create a branch from the upstream default branch, apply the fix, run tests, then open a PR.
 
 The PR description must include:
 - What broke and why (cite the specific breaking change from release notes with a link)
@@ -200,14 +200,14 @@ For fixes in shared libraries, downstream bot PRs will not pick up the fix until
 
 - **Multiple PRs with the same root cause — fix them all at once.** When a wave of bot PRs hits with identical failures (e.g. missing go.sum entry across 5 repos), clone each branch, run `go mod tidy`, and push in one session rather than one by one.
 
-- **After your fix PR merges, bot PRs targeting the same files will have conflicts.** Tick the rebase checkbox to get Renovate to rebase. If the conflict is in go.mod/go.sum, Renovate will re-run artifact updates as part of the rebase.
+- **After your fix PR merges, update the bot PR branch** using `gh pr update-branch` or an empty commit (see Step 7). If the conflict is in go.mod/go.sum, run `go mod tidy` directly on the bot branch after updating it.
 
 - **Coverage drop from removing covered code is not a regression to fix with arbitrary tests.** If you delete duplicated or misplaced code that happened to be well-covered, overall coverage may dip. The right response is to explain why to the team — not to add pointless tests just to hit a number. If coverage is enforced as a CI gate and blocking merges, consider making it non-blocking (`continue-on-error: true`) rather than gaming the percentage.
 
 ## Triage standards
 
 - **One root cause can affect many repos.** Always check the full list of open Konflux PRs before starting — a pattern across repos may point to a shared dependency or shared library as the source, or the same issue appearing independently in each repo due to the bumped package (e.g. a linter version bump introducing the same violation type across all repos).
-- **Check the package's maintenance status** (archived? last commit date? open issues?) before deciding on a fix strategy. An archived package needs replacement; a recently released package may just need a version bump.
+- **Check the package's maintenance status** (archived? last commit date? open issues?) before deciding on a fix strategy. An archived package needs replacement — always raise this with the user before acting.
 - **Check release dates.** If a breaking change was released very recently, downstream packages may not have had time to react yet. Document this and park the PR rather than applying a workaround.
 - **The renovate.json in these repos is centrally managed** (synced from `processing-tools`) — do not edit it in downstream repos to work around dependency conflicts. Fix the conflict properly.
 - **If a failure involves a processing-tools version bump** (pre-commit hooks, shared workflows, shared scripts), always ask the user before fixing it in the downstream repo. The right fix may be upstream in `processing-tools` itself — which is our repo and where the change should live. Fixing it downstream is a workaround; fixing it upstream unblocks all repos at once.
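
The first triage standard, checking the full list of open bot PRs for a cross-repo pattern, could be sketched as below. This is an illustrative sketch under stated assumptions: `group_by_package` is a hypothetical helper, its input is a hypothetical tab-separated `repo<TAB>PR title` listing on stdin (the actual layout of `open-prs-konflux.md` may differ), and it assumes Renovate-style titles such as `Update module X to vY` or `Update dependency X to vY`, optionally prefixed with `chore(deps): `. Titles in any other format pass through unchanged and form their own group.

```bash
#!/usr/bin/env bash
# Hypothetical helper: group open bot PRs by the package they bump.
# Input (stdin): one "repo<TAB>PR title" line per PR (assumed format).
# Output: "count<TAB>package<TAB>comma-separated repos", most common first.
group_by_package() {
  awk -F '\t' '
    {
      pkg = $2
      sub(/^chore\(deps\): /, "", pkg)                 # optional prefix (assumed)
      sub(/^[Uu]pdate (module|dependency) /, "", pkg)  # assumed Renovate-style title
      sub(/ to v?[0-9].*$/, "", pkg)                   # drop the target version
      count[pkg]++
      # Append the repo to the list for this package, comma-separated.
      repos[pkg] = (count[pkg] > 1) ? (repos[pkg] "," $1) : $1
    }
    END {
      for (pkg in count) printf "%d\t%s\t%s\n", count[pkg], pkg, repos[pkg]
    }
  ' | sort -t "$(printf '\t')" -k1,1nr -k2,2
}
```

A package that appears with a high count across several repos is a candidate for one shared investigation rather than several separate ones.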