Skip to content

Use role instead of playbooks - 99-logs.yml#3032

Merged
amartyasinha merged 1 commit intomainfrom
simplify-99-logs
Aug 19, 2025
Merged

Use role instead of playbooks - 99-logs.yml#3032
amartyasinha merged 1 commit intomainfrom
simplify-99-logs

Conversation

@bshewale
Copy link
Copy Markdown
Contributor

@bshewale bshewale commented Jun 3, 2025

It is continuation of simplification job execution 1.

@bshewale bshewale requested review from a team as code owners June 3, 2025 10:01
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Jun 3, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Copy Markdown
Contributor

@fmount fmount left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems more converging the entire ci-framework logging topic into the existing must-gather role.
I see this as a wrong move because we risk to confuse the actions we execute in the role that shouldn't break the principle of managing a single thing (which is running openstack-must-gather).
I'm not sure why removing a playbook and unpacking it into this role results in a simplification, but I see this as an attempt to rename must-gather into a logging role, while we should move the common code into a logging role.

Comment thread roles/os_must_gather/tasks/99-logs.yml Outdated
Comment thread deploy-edpm.yml Outdated
@softwarefactory-project-zuul
Copy link
Copy Markdown

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/91ae268010a94a25ada04e2d938a09ca

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 37m 33s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 20m 29s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 24m 37s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 17s
cifmw-pod-pre-commit FAILURE in 8m 17s
✔️ build-push-container-cifmw-client SUCCESS in 17m 33s
✔️ cifmw-molecule-os_must_gather SUCCESS in 26m 41s

@danpawlik
Copy link
Copy Markdown
Contributor

This seems more converging the entire ci-framework logging topic into the existing must-gather role. I see this as a wrong move because we risk to confuse the actions we execute in the role that shouldn't break the principle of managing a single thing (which is running openstack-must-gather). I'm not sure why removing a playbook and unpacking it into this role results in a simplification, but I see this as an attempt to rename must-gather into a logging role, while we should move the common code into a logging role.

If we are watching on that from this PR - no, that "simplification" is just an illusion, but if we check another playbooks what we were executing - then yes, it is simplification. Import playbook does not give transparent workflow, where moving that to roles:

  • easy understand what is executed - no need to check which playbook was executing that role, because in the logs playbook name is not raised
  • better control variables,
  • in some part faster execution due there is no need to make delegation or gather_facts,
  • easier to debug

About this PR: So if you think that openstack-must-gather is better for collecting logs, than let's remove the 99-logs play and use it directly. Is it ok @fmount ?

@fmount
Copy link
Copy Markdown
Contributor

fmount commented Jun 3, 2025

This seems more converging the entire ci-framework logging topic into the existing must-gather role. I see this as a wrong move because we risk to confuse the actions we execute in the role that shouldn't break the principle of managing a single thing (which is running openstack-must-gather). I'm not sure why removing a playbook and unpacking it into this role results in a simplification, but I see this as an attempt to rename must-gather into a logging role, while we should move the common code into a logging role.

If we are watching on that from this PR - no, that "simplification" is just an illusion, but if we check another playbooks what we were executing - then yes, it is simplification. Import playbook does not give transparent workflow, where moving that to roles:

* easy understand what is executed - no need to check which playbook was executing that role, because in the logs playbook name is not raised

* better control variables,

* in some part faster execution due there is no need to make delegation or gather_facts,

* easier to debug

About this PR: So if you think that openstack-must-gather is better for collecting logs, than let's remove the 99-logs play and use it directly. Is it ok @fmount ?

@danpawlik ack, my point is simply related to the fact that I see tasks moved to the must-gather role that are not related to openstack-must-gather, but more of an orchestration of the entire logging policy.
For the record I'm ok with whatever you feel is better, but if you feel that ci-framework logging (must-gather is just a part of that) needs to be converted into a role (your points about roles are valid), then I think a cifmw_logging role is probably a better fit rather than converging that logic into must gather.

@danpawlik
Copy link
Copy Markdown
Contributor

danpawlik commented Jun 4, 2025

@fmount my point is simply related to the fact that I see tasks moved to the must-gather role that are not related to openstack-must-gather - not really. Tasks are comming from https://github.com/openstack-k8s-operators/ci-framework/blob/main/playbooks/99-logs.yml , copy-paste to new file which is located in roles. What @bshewale should mention in this PR and it is missing right now - in the file playbooks/99-logs.yml there should be a comment:

#
# NOTE: Playbook migrated to: cifmw_setup/tasks/SOMETHING.
# DO NOT EDIT THAT PLAYBOOK. IT WOULD BE REMOVED IN NEAR FUTURE.
#

At this point my suggestion is:

  • move content from playbooks/99-logs.yml to role
  • add comment to the playbooks/99-logs.yml
  • when all playbooks would be migrated, drop playbooks
  • migrate role execution related to 99-logs.yml and use openstack-must-gather

Alternative way:

  • add comment to the playbooks/99-logs.yml
  • in places, where we are calling 99-logs.yml we should call openstack-must-gather

is it ok @fmount ?

@fmount
Copy link
Copy Markdown
Contributor

fmount commented Jun 6, 2025

  • places, where we are calling 99-

The alternative way sounds good. You should include must-gather where you need, not the other way around.

@bshewale bshewale force-pushed the simplify-99-logs branch 3 times, most recently from 2a9af59 to b800a34 Compare June 11, 2025 11:07
@softwarefactory-project-zuul
Copy link
Copy Markdown

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/d9f0fc0886eb4e5c9abddbedf23dc974

✔️ openstack-k8s-operators-content-provider SUCCESS in 36m 49s
podified-multinode-edpm-deployment-crc FAILURE in 16m 26s
cifmw-crc-podified-edpm-baremetal FAILURE in 21m 44s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 57s
cifmw-pod-pre-commit FAILURE in 6m 29s
✔️ build-push-container-cifmw-client SUCCESS in 17m 40s

@softwarefactory-project-zuul
Copy link
Copy Markdown

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/2059a3510eb943a49089abdeb5ebec37

✔️ openstack-k8s-operators-content-provider SUCCESS in 36m 14s
podified-multinode-edpm-deployment-crc FAILURE in 17m 34s
cifmw-crc-podified-edpm-baremetal FAILURE in 20m 54s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 29s
cifmw-pod-pre-commit FAILURE in 9m 01s
✔️ build-push-container-cifmw-client SUCCESS in 22m 22s

@softwarefactory-project-zuul
Copy link
Copy Markdown

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/28d7c2413989454c8cedd6f6eb9980ba

openstack-k8s-operators-content-provider FAILURE in 4m 04s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 7m 50s
cifmw-pod-pre-commit FAILURE in 5m 46s
cifmw-content-provider-build-images FAILURE in 2m 58s
✔️ cifmw-edpm-build-images SUCCESS in 13m 15s
cifmw-multinode-kuttl POST_FAILURE in 14m 44s
cifmw-tcib FAILURE in 3m 46s
ci-framework-openstack-meta-content-provider FAILURE in 4m 10s
✔️ build-push-container-cifmw-client SUCCESS in 19m 37s
✔️ cifmw-molecule-ci_setup SUCCESS in 3m 28s

@softwarefactory-project-zuul
Copy link
Copy Markdown

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/1064f149b47f4ecea216a2a2a8181d62

openstack-k8s-operators-content-provider FAILURE in 4m 03s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 27s
cifmw-pod-pre-commit FAILURE in 5m 49s
cifmw-content-provider-build-images FAILURE in 2m 54s
✔️ cifmw-edpm-build-images SUCCESS in 19m 31s
cifmw-multinode-kuttl POST_FAILURE in 14m 05s
cifmw-tcib FAILURE in 4m 04s
ci-framework-openstack-meta-content-provider FAILURE in 4m 01s
✔️ build-push-container-cifmw-client SUCCESS in 15m 05s
✔️ cifmw-molecule-ci_setup SUCCESS in 3m 19s

@bshewale
Copy link
Copy Markdown
Contributor Author

IMO, We can't just replace the 99-logs.yml playbook with os_must_gather role (as suggested in the comments) as we are calling roles inside the 99-logs.yml so created new playbook in the ci_setup role name - run_logs.yml to replace 99-logs.yml . As it's specific to our cifmw related task so better to add that in the ci_setup role instead of calling that in the other roles like os_must_gather

@softwarefactory-project-zuul
Copy link
Copy Markdown

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/532a9d55d21c494e8dc0f1549a6fffb3

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 53m 13s
podified-multinode-edpm-deployment-crc POST_FAILURE in 1h 02m 44s
cifmw-crc-podified-edpm-baremetal POST_FAILURE in 1h 36m 14s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 12s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 33s
✔️ cifmw-content-provider-build-images SUCCESS in 15m 42s
✔️ cifmw-edpm-build-images SUCCESS in 20m 19s
cifmw-multinode-kuttl POST_FAILURE in 2h 26m 38s
✔️ cifmw-tcib SUCCESS in 46m 58s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 56m 05s
✔️ build-push-container-cifmw-client SUCCESS in 18m 54s
✔️ cifmw-molecule-ci_setup SUCCESS in 3m 22s

@bshewale
Copy link
Copy Markdown
Contributor Author

recheck POST_FAILURE

@bshewale
Copy link
Copy Markdown
Contributor Author

recheck

@softwarefactory-project-zuul
Copy link
Copy Markdown

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3f8233ac2ecd45848132f5bc3d601308

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 42m 27s
podified-multinode-edpm-deployment-crc POST_FAILURE in 1h 28m 44s
cifmw-crc-podified-edpm-baremetal POST_FAILURE in 47m 40s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 32s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 47s
✔️ cifmw-content-provider-build-images SUCCESS in 14m 10s
✔️ cifmw-edpm-build-images SUCCESS in 16m 52s
cifmw-multinode-kuttl POST_FAILURE in 2h 31m 37s
✔️ cifmw-tcib SUCCESS in 44m 23s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 51m 12s
build-push-container-cifmw-client FAILURE in 18m 25s
✔️ cifmw-molecule-ci_setup SUCCESS in 3m 20s

amartyasinha
amartyasinha previously approved these changes Jul 28, 2025
@github-actions
Copy link
Copy Markdown

This PR is stale because it has been for over 15 days with no activity.
Remove stale label or comment or this will be closed in 7 days.

It is continuation of simplification job execution [1].

[1]: #2929
Comment thread roles/cifmw_setup/tasks/run_logs.yml
@danpawlik
Copy link
Copy Markdown
Contributor

please check @fmount if its ok

Copy link
Copy Markdown
Contributor

@evallesp evallesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Copy Markdown
Contributor

@fmount fmount left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci Bot added the lgtm label Aug 19, 2025
@amartyasinha amartyasinha merged commit cc87f54 into main Aug 19, 2025
6 checks passed
@amartyasinha amartyasinha deleted the simplify-99-logs branch August 19, 2025 09:17
@fmount
Copy link
Copy Markdown
Contributor

fmount commented Aug 20, 2025

I think this patch needs a follow up to run the must-gather role on the existing jobs :/

@cescgina
Copy link
Copy Markdown
Contributor

I think this patch needs a follow up to run the must-gather role on the existing jobs :/

@fmount I created one in #3204

cescgina added a commit to cescgina/ci-framework that referenced this pull request Aug 20, 2025
After merging[1], jobs have stopped running must-gather to collect
service logs, e.g [2,3]. This seems to be caused by the change in [4].
Where before the 99-logs.yml playbook was run from an 'ansible-playbook'
invocation, without any inventory and thus the check for
'zuul_log_collection' was never triggered because the variable was never
set. Now, when importing the cifmw_setup role, the cifmw variables are
loaded and that variable which is in most jobs set to True prevents the
log collection, and most critically, must-gather collection.
This change removes the condition to preserve the previous behaviour.

[1] openstack-k8s-operators#3032
[2] https://softwarefactory-project.io/zuul/t/rdoproject.org/build/63673310b4f94883a9aff2db719a2095
[3] https://softwarefactory-project.io/zuul/t/rdoproject.org/build/03e87102fbf04dc7b482ecdba7128085
[4] openstack-k8s-operators@cc87f54#diff-691dae244b4fb4bfc1270fb5d3b708c6be6073b0e33d997db15850edfa3fd976L23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants