@dgoodwin dgoodwin commented Nov 28, 2025

Starting to shift some of our tooling toward Claude Code, to help QSEs and others interpret and work with regressions and PRs.

The first API returns all test failures for a given PR, across both presubmits and /payload jobs.

Example:

curl "http://127.0.0.1:8080/api/pull_requests/test_results?org=openshift&repo=origin&pr_number=30654&start_date=2026-01-06&end_date=2026-01-19"
[
  {
    "prowjob_build_id": "2013067627370385408",
    "prowjob_name": "openshift-origin-30654-nightly-4.21-e2e-vsphere-ovn-techpreview-serial",
    "prowjob_url": "https://prow.ci.openshift.org/view/gs/test-platform-results/logs/openshift-origin-30654-nightly-4.21-e2e-vsphere-ovn-techpreview-serial/2013067627370385408",
    "pr_sha": "8deb29e6c5ffa7de214caac7c659575f502111bf",
    "prowjob_start": "2026-01-19T01:54:57Z",
    "test_name": "[Monitor:legacy-test-framework-invariants-pathological][sig-arch] events should not repeat pathologically",
    "test_suite": "openshift-tests",
    "success": false,
    "flaked": false,
    "failure_content": "4 events happened too frequently\n\nevent happened 21 times, something is wrong: node/ci-op-2bn7n163-32075-kwbbl-master-0 hmsg/f99f0db431 - reason/AddSigtermProtection Adding SIGTERM protection (06:16:55Z) result=reject \nevent happened 21 times, something is wrong: node/ci-op-2bn7n163-32075-kwbbl-master-1 hmsg/f99f0db431 - reason/AddSigtermProtection Adding SIGTERM protection (06:17:13Z) result=reject \nevent happened 23 times, something is wrong: node/ci-op-2bn7n163-32075-kwbbl-worker-0-vzklw hmsg/f99f0db431 - reason/AddSigtermProtection Adding SIGTERM protection (06:17:15Z) result=reject \nevent happened 21 times, something is wrong: node/ci-op-2bn7n163-32075-kwbbl-master-2 hmsg/f99f0db431 - reason/AddSigtermProtection Adding SIGTERM protection (06:17:30Z) result=reject"
  },
  {
    "prowjob_build_id": "2013067628293132288",
    "prowjob_name": "openshift-origin-30654-nightly-4.21-e2e-vsphere-ovn-techpreview-serial",
    "prowjob_url": "https://prow.ci.openshift.org/view/gs/test-platform-results/logs/openshift-origin-30654-nightly-4.21-e2e-vsphere-ovn-techpreview-serial/2013067628293132288",
    "pr_sha": "8deb29e6c5ffa7de214caac7c659575f502111bf",
    "prowjob_start": "2026-01-19T01:54:57Z",
    "test_name": "[Monitor:legacy-test-framework-invariants-pathological][sig-arch] events should not repeat pathologically",
    "test_suite": "openshift-tests",
    "success": false,
    "flaked": false,
    "failure_content": "4 events happened too frequently\n\nevent happened 21 times, something is wrong: node/ci-op-2bn7n163-c3221-nkfbd-master-2 hmsg/f99f0db431 - reason/AddSigtermProtection Adding SIGTERM protection (06:25:36Z) result=reject \nevent happened 21 times, something is wrong: node/ci-op-2bn7n163-c3221-nkfbd-master-1 hmsg/f99f0db431 - reason/AddSigtermProtection Adding SIGTERM protection (06:24:59Z) result=reject \nevent happened 21 times, something is wrong: node/ci-op-2bn7n163-c3221-nkfbd-master-0 hmsg/f99f0db431 - reason/AddSigtermProtection Adding SIGTERM protection (06:25:17Z) result=reject \nevent happened 23 times, something is wrong: node/ci-op-2bn7n163-c3221-nkfbd-worker-0-m2wpd hmsg/f99f0db431 - reason/AddSigtermProtection Adding SIGTERM protection (06:25:18Z) result=reject"
  },
...

The goal is to have AI tooling scan the list of failures for patterns, optionally passing the name of a new test it detected to confirm that the test ran successfully.
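One way a consumer could look for patterns is to count how many distinct job runs failed each test. The following is a minimal sketch, not code from this PR: the struct mirrors the JSON fields in the example response above, while `prFailure`, `countByTest`, and the trimmed sample payload are hypothetical names and made-up data.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// prFailure mirrors a subset of the fields in the example response.
// The type name is hypothetical; the PR's own struct is PRTestResult.
type prFailure struct {
	ProwJobBuildID string `json:"prowjob_build_id"`
	ProwJobName    string `json:"prowjob_name"`
	TestName       string `json:"test_name"`
	Success        bool   `json:"success"`
	Flaked         bool   `json:"flaked"`
	FailureContent string `json:"failure_content"`
}

// countByTest returns, per test name, the number of distinct job runs
// in which that test failed. Successful results are ignored.
func countByTest(results []prFailure) map[string]int {
	seen := map[string]map[string]bool{}
	for _, r := range results {
		if r.Success {
			continue
		}
		if seen[r.TestName] == nil {
			seen[r.TestName] = map[string]bool{}
		}
		seen[r.TestName][r.ProwJobBuildID] = true
	}
	counts := make(map[string]int, len(seen))
	for name, runs := range seen {
		counts[name] = len(runs)
	}
	return counts
}

// samplePayload is made-up data in the shape of the response.
const samplePayload = `[
  {"prowjob_build_id": "1", "test_name": "test A", "success": false},
  {"prowjob_build_id": "2", "test_name": "test A", "success": false},
  {"prowjob_build_id": "2", "test_name": "test B", "success": false}
]`

func main() {
	var results []prFailure
	if err := json.Unmarshal([]byte(samplePayload), &results); err != nil {
		panic(err)
	}
	counts := countByTest(results)
	names := make([]string, 0, len(counts))
	for n := range counts {
		names = append(names, n)
	}
	sort.Strings(names) // stable output order
	for _, n := range names {
		fmt.Printf("%s failed in %d run(s)\n", n, counts[n])
	}
}
```

A test that fails in every run of the PR is a stronger signal than one that fails once, so counting distinct build IDs rather than raw rows seems like a reasonable starting point.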

The second API is a new version of the test outputs scanner that takes a test ID from Component Readiness (CR) and a list of job runs. It will be used to check whether a test is failing the same way across all failures in a test details report.

Example:

curl "http://127.0.0.1:8080/api/tests/v2/outputs?test_id=openshift-tests:30ac23dcd7037e581ed41a85fad97ecf&prow_job_run_ids=2013519360701763584,2013362530763673600,2013242091152347136,2013111257472700416,2013028524092297216"
[
  {
    "url": "https://prow.ci.openshift.org/view/gs/test-platform-results/logs/2013242091152347136",
    "output": "1 events happened too frequently\n\nevent happened 33 times, something is wrong: namespace/openshift-cluster-csi-drivers deployment/vmware-vsphere-csi-driver-operator hmsg/f08cbd1e38 - reason/DaemonSetUpdated Updated DaemonSet.apps/vmware-vsphere-csi-driver-node -n openshift-cluster-csi-drivers because it changed (16:59:53Z) result=reject",
    "test_name": "[Monitor:legacy-test-framework-invariants-pathological][sig-arch] events should not repeat pathologically for ns/openshift-cluster-csi-drivers"
  },
  {
    "url": "https://prow.ci.openshift.org/view/gs/test-platform-results/logs/2013519360701763584",
    "output": "4 events happened too frequently\n\nevent happened 65 times, something is wrong: namespace/openshift-cluster-csi-drivers deployment/vmware-vsphere-csi-driver-operator hmsg/4ef95af7b6 - reason/SecretUpdated Updated Secret/vsphere-csi-config-secret -n openshift-cluster-csi-drivers because it changed (11:09:57Z) result=reject \nevent happened 49 times, something is wrong: namespace/openshift-cluster-csi-drivers deployment/vmware-vsphere-csi-driver-operator hmsg/5fc8006e70 - reason/DeploymentUpdated Updated Deployment.apps/vmware-vsphere-csi-driver-controller -n openshift-cluster-csi-drivers because it changed (11:09:57Z) result=reject \nevent happened 58 times, something is wrong: namespace/openshift-cluster-csi-drivers deployment/vmware-vsphere-csi-driver-operator hmsg/f08cbd1e38 - reason/DaemonSetUpdated Updated DaemonSet.apps/vmware-vsphere-csi-driver-node -n openshift-cluster-csi-drivers because it changed (11:10:10Z) result=reject \nevent happened 26 times, something is wrong: namespace/openshift-cluster-csi-drivers deployment/vmware-vsphere-csi-driver-controller hmsg/33b34fff13 - reason/ScalingReplicaSet (combined from similar events): Scaled down replica set vmware-vsphere-csi-driver-controller-6c5f784757 from 2 to 1 (11:09:57Z) result=reject",
    "test_name": "[Monitor:legacy-test-framework-invariants-pathological][sig-arch] events should not repeat pathologically for ns/openshift-cluster-csi-drivers"
  },
  {
    "url": "https://prow.ci.openshift.org/view/gs/test-platform-results/logs/2013111257472700416",
    "output": "3 events happened too frequently\n\nevent happened 31 times, something is wrong: namespace/openshift-cluster-csi-drivers deployment/vmware-vsphere-csi-driver-operator hmsg/4ef95af7b6 - reason/SecretUpdated Updated Secret/vsphere-csi-config-secret -n openshift-cluster-csi-drivers because it changed (08:03:00Z) result=reject \nevent happened 35 times, something is wrong: namespace/openshift-cluster-csi-drivers deployment/vmware-vsphere-csi-driver-operator hmsg/f08cbd1e38 - reason/DaemonSetUpdated Updated DaemonSet.apps/vmware-vsphere-csi-driver-node -n openshift-cluster-csi-drivers because it changed (08:03:13Z) result=reject \nevent happened 25 times, something is wrong: namespace/openshift-cluster-csi-drivers deployment/vmware-vsphere-csi-driver-operator hmsg/5fc8006e70 - reason/DeploymentUpdated Updated Deployment.apps/vmware-vsphere-csi-driver-controller -n openshift-cluster-csi-drivers because it changed (08:03:02Z) result=reject",
    "test_name": "[Monitor:legacy-test-framework-invariants-pathological][sig-arch] events should not repeat pathologically for ns/openshift-cluster-csi-drivers"
  },
...
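To judge whether a test is "failing the same way", a consumer could reduce each output to a signature and compare signatures across runs. The sketch below assumes the `reason/<Name>` tokens seen in the example outputs are a usable signature; `reasons` and `commonReasons` are hypothetical helpers, and the sample strings are abbreviated stand-ins for real outputs.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// reasonRe matches tokens such as "reason/DaemonSetUpdated".
var reasonRe = regexp.MustCompile(`reason/(\w+)`)

// reasons returns the set of event reasons mentioned in one output.
func reasons(output string) map[string]bool {
	set := map[string]bool{}
	for _, m := range reasonRe.FindAllStringSubmatch(output, -1) {
		set[m[1]] = true
	}
	return set
}

// commonReasons returns, sorted, the reasons present in every output.
func commonReasons(outputs []string) []string {
	if len(outputs) == 0 {
		return nil
	}
	common := reasons(outputs[0])
	for _, out := range outputs[1:] {
		cur := reasons(out)
		for r := range common {
			if !cur[r] {
				delete(common, r) // deleting during range is safe in Go
			}
		}
	}
	result := make([]string, 0, len(common))
	for r := range common {
		result = append(result, r)
	}
	sort.Strings(result)
	return result
}

func main() {
	// Abbreviated stand-ins for the "output" field of three job runs.
	outputs := []string{
		"reason/DaemonSetUpdated Updated DaemonSet.apps",
		"reason/SecretUpdated reason/DeploymentUpdated reason/DaemonSetUpdated reason/ScalingReplicaSet",
		"reason/SecretUpdated reason/DaemonSetUpdated reason/DeploymentUpdated",
	}
	fmt.Println("reasons shared by all runs:", commonReasons(outputs))
}
```

An empty result would suggest the runs fail for unrelated reasons; a non-empty one points at a shared cause worth reading first.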

Assisted by: Claude

@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@openshift-merge-robot added the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) on Nov 28, 2025
@dgoodwin changed the title from "test failures ai list tool" to "WIP: API to list all test failures from a specific PR" on Nov 28, 2025
@coderabbitai

coderabbitai bot commented Nov 28, 2025

Walkthrough

Introduces PR test results API functionality by adding a new module to retrieve and serve test failure data from BigQuery, integrating it into the sippyserver with appropriate HTTP handlers and parameter validation for date ranges and PR identifiers.

Changes

| Cohort / File(s) | Summary |
|---|---|
| PR Test Results API: pkg/api/prtestresults.go | Adds PRTestResult struct and HTTP handler PrintPRTestResultsJSON to serve test results from BigQuery. Implements GetPRTestResults to fetch failures from junit_pr and junit tables, buildPRTestResultsQuery to construct BigQuery queries with optional success filtering and deduplication, executePRTestResultsQuery to execute queries, and deserializePRTestResult to map BigQuery rows to structs. Validates query parameters (org, repo, pr_number, date range), enforces 30-day maximum window, and handles date parsing. |
| Server Integration: pkg/sippyserver/server.go | Adds jsonPullRequestTestResults handler gated by BigQuery client availability, returning 400 if not configured. Registers new /api/pull_requests/test_results endpoint with ComponentReadinessCapability requirement and routes to api.PrintPRTestResultsJSON. |
| Parameter Validation: pkg/util/param/param.go | Introduces dateRegexp for YYYY-MM-DD format validation and extends parameter validation map with org, repo, pr_number, start_date, and end_date keys using appropriate validation rules. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • prtestresults.go: Contains dense logic for BigQuery query construction, date validation, deduplication strategy, and result deserialization. Multiple interdependent functions requiring careful validation of query correctness and edge case handling.
  • Date range enforcement: 30-day maximum window logic and civil.DateTime parsing for prowjob_start require verification of correctness.
  • BigQuery table handling: Dual table queries (junit_pr and junit) with conditional inclusion of successes need careful review for accuracy and performance implications.
  • Server integration: Verify proper gating of functionality and endpoint registration alongside existing pull request endpoints.

@openshift-ci bot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) on Nov 28, 2025
@openshift-ci

openshift-ci bot commented Nov 28, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dgoodwin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) on Nov 28, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 1358ea0 and a7c070d.

📒 Files selected for processing (3)
  • pkg/api/prtestresults.go (1 hunks)
  • pkg/sippyserver/server.go (2 hunks)
  • pkg/util/param/param.go (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • pkg/api/prtestresults.go
  • pkg/sippyserver/server.go
  • pkg/util/param/param.go
🧬 Code graph analysis (1)
pkg/sippyserver/server.go (2)
pkg/api/prtestresults.go (1)
  • PrintPRTestResultsJSON (290-402)
pkg/sippyserver/capabilities.go (1)
  • ComponentReadinessCapability (15-15)
🔇 Additional comments (5)
pkg/util/param/param.go (1)

29-29: LGTM!

The new parameter validations follow the established patterns in this file. The dateRegexp validates the YYYY-MM-DD format structurally, and semantic date validation (e.g., rejecting invalid dates like 2024-99-99) is correctly handled by time.Parse in the API handler.

Also applies to: 44-46, 52-53
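The two-stage validation described here (a structural regex check, then a semantic check by time.Parse) might look roughly like the sketch below. The pattern in `dateShape` is an assumption, since the PR's actual dateRegexp is not shown in this thread, and `parseDate` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// dateShape is an assumed YYYY-MM-DD pattern, not the PR's dateRegexp.
var dateShape = regexp.MustCompile(`^\d{4}-\d{2}-\d{2}$`)

// parseDate applies the structural check first, then lets time.Parse
// reject values that have the right shape but are not real dates.
func parseDate(s string) (time.Time, error) {
	if !dateShape.MatchString(s) {
		return time.Time{}, fmt.Errorf("%q is not in YYYY-MM-DD format", s)
	}
	return time.Parse("2006-01-02", s)
}

func main() {
	for _, s := range []string{"2026-01-06", "2024-99-99", "01/06/2026"} {
		_, err := parseDate(s)
		fmt.Printf("%s valid=%t\n", s, err == nil)
	}
}
```

The regex alone accepts 2024-99-99 because it only checks digit counts and separators; the time.Parse call is what rejects it.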

pkg/sippyserver/server.go (2)

1026-1032: LGTM!

The handler follows the established pattern for BigQuery-dependent endpoints, with proper nil-check for bigQueryClient and delegation to the API implementation.
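For readers unfamiliar with the gating pattern, here is a minimal sketch using only the standard library. The `server` and `queryClient` types, the handler name, and the error message are all hypothetical stand-ins; the real handler in pkg/sippyserver/server.go is not reproduced in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// queryClient is a hypothetical stand-in for the BigQuery client.
type queryClient struct{}

// server holds an optional client, nil when BigQuery is not configured.
type server struct {
	client *queryClient
}

// handleTestResults returns 400 when no client is configured and only
// delegates to the real work (elided here) when one is available.
func (s *server) handleTestResults(w http.ResponseWriter, r *http.Request) {
	if s.client == nil {
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusBadRequest)
		_ = json.NewEncoder(w).Encode(map[string]interface{}{
			"code":    http.StatusBadRequest,
			"message": "this endpoint requires a configured BigQuery client",
		})
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusFor exercises the handler in memory and returns the status code.
func statusFor(s *server) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/pull_requests/test_results", nil)
	s.handleTestResults(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("without client:", statusFor(&server{}))
	fmt.Println("with client:", statusFor(&server{client: &queryClient{}}))
}
```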


2028-2033: LGTM!

The endpoint registration is well-structured with appropriate capability requirements and clear documentation.

pkg/api/prtestresults.go (2)

39-75: LGTM!

The function correctly queries both BigQuery tables and combines results. The logging is comprehensive for debugging.
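Combining rows from two tables usually needs some guard against duplicates. The sketch below shows one possible approach, keeping the first row seen for each (build ID, test name) pair. That key is an assumption, since the PR's actual deduplication strategy is not shown in this thread, and `result` and `mergeResults` are hypothetical names.

```go
package main

import "fmt"

// result is a hypothetical, trimmed-down row; both fields together
// form the assumed deduplication key.
type result struct {
	BuildID  string
	TestName string
}

// mergeResults concatenates rows from the two tables, preserving order
// and dropping any row whose key has already been seen.
func mergeResults(fromJunitPR, fromJunit []result) []result {
	seen := map[result]bool{}
	merged := make([]result, 0, len(fromJunitPR)+len(fromJunit))
	for _, rows := range [][]result{fromJunitPR, fromJunit} {
		for _, r := range rows {
			if seen[r] {
				continue
			}
			seen[r] = true
			merged = append(merged, r)
		}
	}
	return merged
}

func main() {
	a := []result{{"1", "test A"}, {"1", "test B"}}
	b := []result{{"1", "test A"}, {"2", "test A"}}
	for _, r := range mergeResults(a, b) {
		fmt.Println(r.BuildID, r.TestName)
	}
}
```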


212-240: LGTM!

The query execution follows proper BigQuery iteration patterns with appropriate error handling.

Comment on lines 250 to 284
	for i, field := range schema {
		switch field.Name {
		case "prowjob_build_id":
			result.ProwJobBuildID = row[i].(string)
		case "prowjob_name":
			result.ProwJobName = row[i].(string)
		case "prowjob_url":
			if row[i] != nil {
				result.ProwJobURL = row[i].(string)
			}
		case "pr_sha":
			if row[i] != nil {
				result.PRSha = row[i].(string)
			}
		case "prowjob_start":
			if row[i] != nil {
				// BigQuery returns civil.DateTime for DATETIME columns
				civilDT := row[i].(civil.DateTime)
				layout := "2006-01-02T15:04:05"
				parsedTime, err := time.Parse(layout, civilDT.String())
				if err != nil {
					return PRTestResult{}, errors.Wrap(err, "failed to parse prowjob_start")
				}
				result.ProwJobStart = parsedTime
			}
		case "test_name":
			result.TestName = row[i].(string)
		case "testsuite":
			result.TestSuite = row[i].(string)
		case "flaked":
			result.Flaked = row[i].(bool)
		case "success":
			result.Success = row[i].(bool)
		}
	}

⚠️ Potential issue | 🟠 Major

Type assertions may panic on nil values.

Fields like prowjob_build_id, prowjob_name, test_name, testsuite, flaked, and success are type-asserted without nil checks. If BigQuery returns NULL for any of these columns, the code will panic at runtime.

Consider adding nil checks for all fields, consistent with how prowjob_url and pr_sha are handled:

 		case "prowjob_build_id":
-			result.ProwJobBuildID = row[i].(string)
+			if row[i] != nil {
+				result.ProwJobBuildID = row[i].(string)
+			}
 		case "prowjob_name":
-			result.ProwJobName = row[i].(string)
+			if row[i] != nil {
+				result.ProwJobName = row[i].(string)
+			}

Apply similar changes to test_name, testsuite, flaked, and success.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In pkg/api/prtestresults.go around lines 250 to 284, several case branches
perform direct type assertions that will panic if row[i] is nil; make them
consistent with prowjob_url and pr_sha by checking row[i] != nil before
asserting. For string fields (prowjob_build_id, prowjob_name, test_name,
testsuite) wrap the assignment in an if row[i] != nil { result.Field =
row[i].(string) } else leave as empty string; for boolean fields (flaked,
success) likewise check if row[i] != nil before asserting and assign only when
non-nil (otherwise leave default false). Keep the existing prowjob_start parsing
as-is.
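An alternative to repeating `if row[i] != nil` in every case is the comma-ok form of a type assertion, which never panics. The helpers below, `asString` and `asBool`, are hypothetical and not part of the PR. One difference from the suggested diff: comma-ok also silently yields the zero value when the value has an unexpected type, which may or may not be desirable.

```go
package main

import "fmt"

// asString returns v as a string, or "" when v is nil or not a string.
func asString(v interface{}) string {
	s, _ := v.(string)
	return s
}

// asBool returns v as a bool, or false when v is nil or not a bool.
func asBool(v interface{}) bool {
	b, _ := v.(bool)
	return b
}

func main() {
	// A made-up row in which two columns came back as NULL (nil).
	row := []interface{}{"2013067627370385408", nil, true, nil}
	fmt.Printf("%q %q %t %t\n",
		asString(row[0]), asString(row[1]), asBool(row[2]), asBool(row[3]))
}
```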

Comment on lines 364 to 374
	// Add one day to end_date to make it inclusive
	endDate = endDate.AddDate(0, 0, 1)

	// Validate date range
	if endDate.Before(startDate) {
		RespondWithJSON(http.StatusBadRequest, w, map[string]interface{}{
			"code":    http.StatusBadRequest,
			"message": "end_date must be after start_date",
		})
		return
	}

⚠️ Potential issue | 🟡 Minor

Date validation order issue: adding a day before range check may allow invalid ranges.

The code adds 1 day to end_date (line 365) before validating that end_date is after start_date (line 368). This allows edge cases where the original end_date equals start_date minus 1 day to pass validation.

Example: start_date=2024-01-02, end_date=2024-01-01

  • After line 365: endDate = 2024-01-02
  • Line 368: endDate.Before(startDate) is false, validation passes
  • But the user intended an invalid range where end is before start.

Consider validating the original dates before adjusting:

+	// Validate date range before adjustment
+	if endDate.Before(startDate) {
+		RespondWithJSON(http.StatusBadRequest, w, map[string]interface{}{
+			"code":    http.StatusBadRequest,
+			"message": "end_date must be after or equal to start_date",
+		})
+		return
+	}
+
 	// Add one day to end_date to make it inclusive
 	endDate = endDate.AddDate(0, 0, 1)
-
-	// Validate date range
-	if endDate.Before(startDate) {
-		RespondWithJSON(http.StatusBadRequest, w, map[string]interface{}{
-			"code":    http.StatusBadRequest,
-			"message": "end_date must be after start_date",
-		})
-		return
-	}
🤖 Prompt for AI Agents
In pkg/api/prtestresults.go around lines 364 to 374, the code currently adds one
day to endDate before checking the date range which can make originally-invalid
ranges appear valid; to fix this, validate that the original endDate is not
before startDate first (or compare startDate to a copy of endDateBeforeAdjust :=
endDate), return the bad request if endDate < startDate, and only after that add
one day to endDate to make the range inclusive.

@openshift-merge-robot removed the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) on Jan 23, 2026
@dgoodwin changed the title from "WIP: API to list all test failures from a specific PR" to "New APIs for all failures from a PR, and test failure messages from a PR" on Jan 23, 2026
@openshift-ci bot removed the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) on Jan 23, 2026
@openshift-ci-robot

Scheduling required tests:
/test e2e

@openshift-ci

openshift-ci bot commented Jan 26, 2026

@dgoodwin: all tests passed!

