fix: analyze rhel* nodes #1918

Closed
davdhacs wants to merge 3 commits into master from davdhacs/rhel-node-baseimage

@davdhacs
Contributor

@davdhacs davdhacs commented Jun 5, 2025

OCP switched to using RHEL base-image nodes instead of RHCOS, so node-inventory now detects the namespace `rhel:9` on these nodes and refuses to scan them.

before

example failure in ocp-next-candidate-qa-e2e (https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/stackrox_stackrox/14774/pull-ci-stackrox-stackrox-master-ocp-next-candidate-qa-e2e-tests/1929962151737298944):

node.getScan().getComponentsList().size() >= 4
|    |         |                   |      |
|    |         []                  0      false
|    io.stackrox.proto.storage.NodeOuterClass.NodeScan@7913d38f
id: "67b4cc3d-2c75-4ba7-b78b-f6402575914b"
name: "rox-ci-37298944-67z5w-worker-b-wzxv6"
cluster_id: "b4521497-d7a3-4eb6-8f32-00027c7c1e61"
cluster_name: "remote"
...
labels {
  key: "node.openshift.io/os_id"
  value: "rhel"
}

node-inventory log:

{"Event":"Running Scanner version 2.36.x-77-gbbd29cb742-dirty in Node Inventory mode","Level":"info","Location":"main.go:279","Time":"2025-06-03 20:50:13.425441"}
{"Event":"Launching backend GRPC listener on 127.0.0.1:8444","Level":"info","Location":"grpc.go:56","Time":"2025-06-03 20:50:13.426143"}
{"Event":"Unable to start node scanning for this namespace","Level":"warning","Location":"detection.go:49","Time":"2025-06-03 20:50:29.676477","detected namespace":"rhel:9","layer":"rox-ci-37298944-67z5w-worker-b-wzxv6"}
{"Event":"Error scanning node /host inventory: Node scanning is unsupported for this node","Level":"error","Location":"inventorizer.go:58","Time":"2025-06-03 20:50:29.676694"}
{"Event":"error analyzing node \"rox-ci-37298944-67z5w-worker-b-wzxv6\": Node scanning is unsupported for this node","Level":"error","Location":"service.go:49","Time":"2025-06-03 20:50:29.676745"}
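
The warning above shows the detected namespace `rhel:9` being rejected before any scan starts. The following is a minimal sketch of the kind of prefix check that would accept `rhel*` namespaces alongside RHCOS ones. It is not the scanner's actual code: the function name `isSupportedNodeNamespace`, the `supportedPrefixes` list, and the `rhcos:4.18` namespace format are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// supportedPrefixes is a hypothetical allow-list of OS namespace prefixes.
// "rhcos:" is assumed to be the existing entry; "rhel:" is the addition
// this PR argues for.
var supportedPrefixes = []string{"rhcos:", "rhel:"}

// isSupportedNodeNamespace reports whether a detected namespace such as
// "rhel:9" starts with one of the allowed prefixes.
func isSupportedNodeNamespace(namespace string) bool {
	for _, prefix := range supportedPrefixes {
		if strings.HasPrefix(namespace, prefix) {
			return true
		}
	}
	return false
}

func main() {
	// "rhel:9" is taken from the log above; the other values are examples only.
	for _, ns := range []string{"rhcos:4.18", "rhel:9", "ubuntu:22.04"} {
		fmt.Printf("%-14s supported=%t\n", ns, isSupportedNodeNamespace(ns))
	}
}
```

Whether accepting `rhel:` is safe for vulnerability matching is a separate question from whether the check passes.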

after

example ocp-next-candidate-qa-e2e success with this change (https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/stackrox_stackrox/14774/pull-ci-stackrox-stackrox-master-ocp-next-candidate-qa-e2e-tests/1930502396882980864):

NodeInventoryTest > Verify node inventories and their scans STANDARD_OUT
...
    08:44:19 | INFO  | NodeInventoryTest         | NodeInventoryTest         | Waiting for scanner deployment to be ready
    08:44:19 | INFO  | NodeInventoryTest         | NodeInventoryTest         | Node rox-ci-82980864-kzqwm-worker-b-8t8zc scan contains 561 components
    08:44:19 | INFO  | NodeInventoryTest         | NodeInventoryTest         | Node rox-ci-82980864-kzqwm-master-1.c.acs-san-stackroxci.internal scan contains 561 components
    08:44:19 | INFO  | NodeInventoryTest         | NodeInventoryTest         | Node rox-ci-82980864-kzqwm-worker-a-w2g7k scan contains 561 components
    08:44:19 | INFO  | NodeInventoryTest         | NodeInventoryTest         | Node rox-ci-82980864-kzqwm-master-2.c.acs-san-stackroxci.internal scan contains 561 components
    08:44:19 | INFO  | NodeInventoryTest         | NodeInventoryTest         | Node rox-ci-82980864-kzqwm-master-0.c.acs-san-stackroxci.internal scan contains 561 components
    08:44:19 | INFO  | NodeInventoryTest         | NodeInventoryTest         | Ending testcase

node-inventory log:

{"Event":"Running Scanner version 2.36.x-82-ga0c4215edd-dirty in Node Inventory mode","Level":"info","Location":"main.go:279","Time":"2025-06-05 08:44:02.652414"}
{"Event":"Launching backend GRPC listener on 127.0.0.1:8444","Level":"info","Location":"grpc.go:56","Time":"2025-06-05 08:44:02.653594"}
{"Event":"Loading repo-to-cpe map into mem","Level":"info","Location":"singleton.go:22","Time":"2025-06-05 08:45:17.455207"}
{"Event":"Done loading repo-to-cpe map into mem","Level":"info","Location":"singleton.go:24","Time":"2025-06-05 08:45:17.477972"}
{"Event":"Finished node scan: node \"rox-ci-82980864-kzqwm-worker-a-w2g7k\" with 568 rhel-components and notes: [LANGUAGE_CVES_UNAVAILABLE]","Level":"info","Location":"service.go:58","Time":"2025-06-05 08:45:17.479993"}

example on OCP 4.18:

before https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/stackrox_stackrox/14774/pull-ci-stackrox-stackrox-master-ocp-4-18-qa-e2e-tests/1927779159271018496

node-inventory log:

{"Event":"Running Scanner version 2.36.x-71-gdf1cab833b-dirty in Node Inventory mode","Level":"info","Location":"main.go:279","Time":"2025-05-28 20:20:57.258200"}
{"Event":"Launching backend GRPC listener on 127.0.0.1:8444","Level":"info","Location":"grpc.go:56","Time":"2025-05-28 20:20:57.258943"}
{"Event":"Loading repo-to-cpe map into mem","Level":"info","Location":"singleton.go:22","Time":"2025-05-28 20:21:17.388270"}
{"Event":"Done loading repo-to-cpe map into mem","Level":"info","Location":"singleton.go:24","Time":"2025-05-28 20:21:17.419967"}
{"Event":"Finished node scan: node \"rox-ci-71018496-rwhkd-worker-b-7bf4h\" with 564 rhel-components and notes: [LANGUAGE_CVES_UNAVAILABLE]","Level":"info","Location":"service.go:58","Time":"2025-05-28 20:21:17.422396"}

after https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/stackrox_stackrox/14774/pull-ci-stackrox-stackrox-master-ocp-4-18-qa-e2e-tests/1930502396731985920

node-inventory log:

{"Event":"Running Scanner version 2.36.x-82-ga0c4215edd-dirty in Node Inventory mode","Level":"info","Location":"main.go:279","Time":"2025-06-05 08:34:54.100067"}
{"Event":"Launching backend GRPC listener on 127.0.0.1:8444","Level":"info","Location":"grpc.go:56","Time":"2025-06-05 08:34:54.100994"}
{"Event":"Loading repo-to-cpe map into mem","Level":"info","Location":"singleton.go:22","Time":"2025-06-05 08:38:52.294122"}
{"Event":"Done loading repo-to-cpe map into mem","Level":"info","Location":"singleton.go:24","Time":"2025-06-05 08:38:52.324479"}
{"Event":"Finished node scan: node \"rox-ci-31985920-5wr4w-master-2.c.acs-san-stackroxci.internal\" with 564 rhel-components and notes: [LANGUAGE_CVES_UNAVAILABLE]","Level":"info","Location":"service.go:58","Time":"2025-06-05 08:38:52.326283"}
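
The before/after comparison on OCP 4.18 rests on the component count in the `Finished node scan` event (564 in both runs). A rough sketch of pulling that count out of a node-inventory log line could look like this. The helper `componentCount` and the `logEntry` type are hypothetical, and the regular expression only assumes the event wording seen in the logs quoted here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strconv"
)

// logEntry mirrors only the fields needed from the JSON log lines above.
type logEntry struct {
	Event string `json:"Event"`
	Level string `json:"Level"`
}

// finishedScan matches the "Finished node scan" event text seen in the logs.
var finishedScan = regexp.MustCompile(`^Finished node scan: node "([^"]+)" with (\d+) rhel-components`)

// componentCount returns the node name and component count when the line is
// a "Finished node scan" event; ok is false for any other line.
func componentCount(line string) (node string, count int, ok bool) {
	var entry logEntry
	if err := json.Unmarshal([]byte(line), &entry); err != nil {
		return "", 0, false
	}
	m := finishedScan.FindStringSubmatch(entry.Event)
	if m == nil {
		return "", 0, false
	}
	n, err := strconv.Atoi(m[2])
	if err != nil {
		return "", 0, false
	}
	return m[1], n, true
}

func main() {
	// Log line copied from the OCP 4.18 "before" run.
	line := `{"Event":"Finished node scan: node \"rox-ci-71018496-rwhkd-worker-b-7bf4h\" with 564 rhel-components and notes: [LANGUAGE_CVES_UNAVAILABLE]","Level":"info","Location":"service.go:58","Time":"2025-05-28 20:21:17.422396"}`
	if node, count, ok := componentCount(line); ok {
		fmt.Printf("%s: %d components\n", node, count)
	}
}
```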

related:
complianceAsCode change: ComplianceAsCode/content#13369
openshift-installer ticket: https://issues.redhat.com//browse/COS-3014

@openshift-ci

openshift-ci Bot commented Jun 5, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@davdhacs
Contributor Author

davdhacs commented Jun 5, 2025

Testing with this scanner build in stackrox/stackrox#14774 shows the ocp-next-candidate e2e tests, which were failing on master, now passing: https://prow.ci.openshift.org/pr-history/?org=stackrox&repo=stackrox&pr=14774

@davdhacs davdhacs changed the title from "analyze rhel* nodes" to "fix: analyze rhel* nodes" Jun 5, 2025
@davdhacs davdhacs marked this pull request as ready for review June 5, 2025 13:54
@davdhacs davdhacs requested a review from a team as a code owner June 5, 2025 13:54
@vikin91
Contributor

vikin91 commented Jun 5, 2025

Where is this change coming from? Why is this considered a fix - is this expected to work?
The node scanning will work for RHEL, but the reliability of vulnerability matching can be affected, producing false-positive or false-negative results.

This should not be an issue in scanning v2 as we now offer the same with scanner v4, but I am curious what the motivation for this change is.

@openshift-ci

openshift-ci Bot commented Jun 5, 2025

@davdhacs: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-tests | a0c4215 | link | false | /test e2e-tests |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@davdhacs
Contributor Author

davdhacs commented Jun 5, 2025

> Where is this change coming from? Why is this considered a fix - is this expected to work? The node scanning will work for RHEL, but the reliability of vulnerability matching can be affected, producing false-positive or false-negative results.
>
> This should not be an issue in scanning v2 as we now offer the same with scanner v4, but I am curious what the motivation for this change is.

Thanks for looking at this! This is to fix the NodeInventoryTest on OCP 4.19 (example: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/stackrox_stackrox/14774/pull-ci-stackrox-stackrox-master-ocp-next-candidate-qa-e2e-tests/1929962151737298944).

I do not know enough about scanner v2 vs. v4: does the NodeInventoryTest use scanner v2 only? If we need to change the test instead of making this change in ScannerV2, then that seems better anyway.

@vikin91
Contributor

vikin91 commented Jun 5, 2025

> Thanks for looking at this! This is to fix the NodeInventoryTest on OCP 4.19 (example: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/stackrox_stackrox/14774/pull-ci-stackrox-stackrox-master-ocp-next-candidate-qa-e2e-tests/1929962151737298944).
>
> I do not know enough about scanner v2 vs. v4: does the NodeInventoryTest use scanner v2 only? If we need to change the test instead of making this change in ScannerV2, then that seems better anyway.

OK, so I understand that the motivation is to fix a failing test.
Yes, "Node Inventory" is for Scanner v2. For Scanner v4 we would use the "Node Index" (which is roughly the same thing, but we changed the naming convention to fit into claircore).
I would suggest disabling that test instead, as we have not tested scanning on RHEL machines and we limited the scope to RHCOS on purpose.
Maybe the scanner team would know whether it is safe to scan RHEL nodes as if they were RHCOS?
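
To illustrate the alternative of limiting the check to RHCOS rather than widening the scanner, here is a hedged sketch of a guard keyed on the `node.openshift.io/os_id` label shown in the failing test output. The real NodeInventoryTest is not written in Go; the function name `shouldCheckInventory` and the label value `rhcos` are assumptions (only the value `rhel` appears in this thread).

```go
package main

import "fmt"

// osIDLabel is the node label key shown in the failing test output.
const osIDLabel = "node.openshift.io/os_id"

// shouldCheckInventory is a hypothetical guard: it returns true only when the
// node reports RHCOS, so inventory assertions are skipped for other OS IDs.
// The "rhcos" value is an assumption, not taken from this thread.
func shouldCheckInventory(labels map[string]string) bool {
	return labels[osIDLabel] == "rhcos"
}

func main() {
	rhelNode := map[string]string{osIDLabel: "rhel"}   // value from the failing run
	rhcosNode := map[string]string{osIDLabel: "rhcos"} // assumed value
	fmt.Println("rhel node checked: ", shouldCheckInventory(rhelNode))
	fmt.Println("rhcos node checked:", shouldCheckInventory(rhcosNode))
}
```

A guard like this keeps the test's scope aligned with what has actually been validated, at the cost of leaving RHEL-based nodes unverified.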

@vikin91
Contributor

vikin91 commented Jun 5, 2025

But that is a big change if OCP 4.19 does not use RHCOS anymore. Let's inform the PM.

@davdhacs
Contributor Author

davdhacs commented Jun 5, 2025

Discussion in chat: https://redhat-internal.slack.com/archives/C033Z8KMZAM/p1749135523469219?thread_ts=1749130373.340809&cid=C033Z8KMZAM
Related Jira ticket for the change in OCP: https://issues.redhat.com/browse/OCPSTRAT-1190
"""
RHCOS has always been RHEL, but it has also always had its own compose and set of extra OCP-specific packages. With this feature, RHCOS would be broken down into three distinct layers:
- the rhel-bootc layer, coming from image mode for RHEL
- the CoreOS layer, which adds CoreOS-specific packages and scripts
- the OpenShift node layer, which adds OpenShift-specific packages and scripts
"""

@davdhacs
Contributor Author

davdhacs commented Jun 5, 2025

Cancelled - the test will be turned off

@davdhacs davdhacs closed this Jun 5, 2025
2 participants