@pmoravec
Contributor

Forgetting to provide the mandatory username of the AAP containerized deployment causes the plugin to be skipped completely, so no data is collected.

Let's add a heuristic, "is there a sole user running some podman container?", to detect the user in such a case.

Closes: #4150


Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines

  • Is the commit message split over multiple lines and hard-wrapped at 72 characters?
  • Is the subject and message clear and concise?
  • Does the subject start with [plugin_name] if submitting a plugin patch or a [section_name] if part of the core sosreport code?
  • Does the commit contain a Signed-off-by: First Lastname email@example.com?
  • Are any related Issues or existing PRs properly referenced via a Closes (Issue) or Resolved (PR) line?
  • Are all passwords or private data gathered by this PR obfuscated?

@pmoravec
Contributor Author

Kindly asking @snagoor for a review.

In particular, I am not fully happy with the heuristic for determining which user runs a podman container (what if all pods are down? what if multiple users run containers?). But I am not aware of anything better.
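
For readers following along, a minimal sketch of the "sole user" idea, not the code in this PR. It assumes `ps aux`-style output where the first column is the user, treats `/usr/bin/conmon` processes as the marker for a running podman container, and ignores root; the function name and the regular expression are hypothetical.

```python
import re

# Hypothetical pattern: user in the first column, PID in the second,
# and /usr/bin/conmon somewhere in the command part of the line.
CONMON_RE = re.compile(r"^(?P<user>\S+)\s+\d+\s.*\s/usr/bin/conmon\s")


def guess_podman_user(ps_output):
    """Return the single non-root user owning conmon processes, else None."""
    users = set()
    for line in ps_output.splitlines():
        match = CONMON_RE.match(line)
        # Assumption: AAP runs rootless, so root-owned conmon is ignored.
        if match and match.group("user") != "root":
            users.add(match.group("user"))
    # Trust the guess only when exactly one candidate user was seen.
    return users.pop() if len(users) == 1 else None
```

Both open questions show up directly here: with all pods down there are no conmon lines, and with several users the set has more than one entry, so in either case the function returns None.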

pmoravec added a commit to pmoravec/sos that referenced this pull request Oct 29, 2025
Forgetting to provide mandatory username of the AAP containerized
deployment leads to skipping plugin completely and lack of collected
data.

Let add a heuristic "is there a sole user running some podman
container?" to detect the user in such a case.

Closes: sosreport#4150

Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
@pmoravec pmoravec force-pushed the sos-pmoravec-aap-containerized-username-heuristic branch from 6cbee72 to 8253ed1 Compare October 29, 2025 13:50
@packit-as-a-service

Congratulations! One of the builds has completed. 🍾

You can install the built RPMs by following these steps:

  • sudo yum install -y dnf-plugins-core on RHEL 8
  • sudo dnf install -y dnf-plugins-core on Fedora
  • dnf copr enable packit/sosreport-sos-4150
  • And now you can install the packages.

Please note that the RPMs should be used only in a testing environment.

@pmoravec pmoravec added Reviewed/Needs 2nd Ack Require a 2nd ack from a maintainer Status/Needs Review This issue still needs a review from project members Kind/RedHat RedHat related item labels Oct 29, 2025
Member

@TurboTurtle TurboTurtle left a comment

ack to code, minor nit/suggestion below.

@pmoravec pmoravec force-pushed the sos-pmoravec-aap-containerized-username-heuristic branch from 8253ed1 to d8b9885 Compare October 29, 2025 19:04
@snagoor
Contributor

snagoor commented Oct 30, 2025

Kindly asking @snagoor for a review.

In particular, I am not fully happy with the heuristic for determining which user runs a podman container (what if all pods are down? what if multiple users run containers?). But I am not aware of anything better.

@pmoravec thanks for tagging me. An AAP containerized deployment uses a non-root user (as it runs rootless podman containers). We can't reliably predict the username, as it's chosen by the end-user. So it's mandatory for the end-user to provide the username; otherwise we skip collecting data from this plugin (this was my intention when I initially wrote it).

I will try to answer your queries here:

  1. If pods are completely down, it still collects logs from those containers too (we use ps -a), see https://github.com/sosreport/sos/blob/main/sos/report/plugins/aap_containerized.py#L130

  2. Can multiple users run containers? >>> Not possible, as we specifically dedicate a single non-root user to run the AAP containerized setup, as per the doc https://docs.redhat.com/en/documentation/red_hat_ansible_automation_platform/2.4/html/containerized_ansible_automation_platform_installation_guide/aap-containerized-installation

I will go through this PR and share my feedback.

@snagoor
Contributor

snagoor commented Nov 3, 2025

Thanks @pmoravec for the update. I appreciate the effort to improve usability, but I have a few concerns:

  • The heuristic for detecting the AAP username via ps aux is unreliable in edge cases—especially when no containers are running or multiple users use Podman.

  • AAP containerized deployments use a non-root user chosen by the end-user, so automatic detection isn't guaranteed to be accurate.

  • The original plugin design intentionally required the username to be provided to avoid ambiguity, which I still believe is the safest approach.

  • Even if we retain the heuristic, it should be optional, with fallback logic that avoids capturing data not relevant to AAP (another rootless user's pods can also exist on the system, but it's a corner case).

So, my final word would be to stick to the old design and log a warning asking to pass the username, as it's mandatory for this plugin.

@pmoravec
Contributor Author

pmoravec commented Nov 4, 2025

Thanks for the feedback. My aim was to improve the current behaviour: detect the right user by some heuristic (with limitations we are both aware of, but those are rather corner cases) only when the username is not set. Running sos report -k aap_containerized.username=myaapuser will always use the myaapuser username, regardless of this change.
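
The intended precedence can be sketched as follows. This is only an illustration with hypothetical names (`resolve_username`, `guess`), not the plugin's actual code: an explicitly provided option always wins, and the heuristic is consulted only when the option is empty.

```python
def resolve_username(option_value, guess):
    """Return (username, was_guessed).

    option_value: value of the username plugin option, possibly empty/None.
    guess: zero-argument callable implementing some heuristic (hypothetical).
    """
    if option_value:
        # Explicit option: never run the heuristic at all.
        return option_value, False
    guessed = guess()
    # was_guessed is True only if the heuristic actually found a user.
    return guessed, guessed is not None
```

A result of `(None, False)` corresponds to today's behaviour: no username, plugin skipped.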

The use case "somebody forgets to set the username" happens. It happened to me (I forgot to ask for the plugin option), it happened to a few colleagues, and it can happen to a customer who simply forgets to set it despite being told. Then we get a sosreport with largely incomplete information, and one more roundtrip to the customer is needed.

My aim is to limit those delays, nothing else.

I know the heuristic has its limitations, as you describe. Still, I see it as beneficial when the AAP username is not set (not such a rare case? I can check for some stats here). Rule-of-thumb percentages follow:

  • the heuristic will properly identify the AAP user in, say, 90% of cases - this is the improvement
  • the heuristic will not detect any user (all pods are down OR multiple users run similar pods) in 8% of cases - same behaviour as today, the plugin collects nothing
  • the heuristic will detect a wrong user (only a different user runs very similar pods) in 2% of cases - a regression that can confuse the support engineer interpreting the sosreport. Either the engineer recognizes that the wrong user was detected (then no harm beyond the time lost identifying this), or the engineer will be confused (a really big con).

Comparing the pros and cons (on guesstimated percentages), I think the overall result is a gain. Further, we can improve the heuristic in any of these ways:

  • use stricter pattern matching on the ps output to limit false positives. When a process has as many parameters as, e.g.:
aap       337366  0.0  0.0 226468  2316 ?        -    11:06   0:00 /usr/bin/conmon --api-version 1 -c 622206ecf7b8e49a65c0ec53fbc61a0a27dad6cb929a513c4491b8fc728d4796 -u 622206ecf7b8e49a65c0ec53fbc61a0a27dad6cb929a513c4491b8fc728d4796 -r /usr/bin/crun -b /var/lib/aap/.local/share/containers/storage/overlay-containers/622206ecf7b8e49a65c0ec53fbc61a0a27dad6cb929a513c4491b8fc728d4796/userdata -p /run/user/1001/containers/overlay-containers/622206ecf7b8e49a65c0ec53fbc61a0a27dad6cb929a513c4491b8fc728d4796/userdata/pidfile -n automation-hub-api --exit-dir /run/user/1001/libpod/tmp/exits --persist-dir /run/user/1001/libpod/tmp/persist/622206ecf7b8e49a65c0ec53fbc61a0a27dad6cb929a513c4491b8fc728d4796 --full-attach -s -l journald --log-level warning --syslog --runtime-arg --log-format=json --runtime-arg --log --runtime-arg=/run/user/1001/containers/overlay-containers/622206ecf7b8e49a65c0ec53fbc61a0a27dad6cb929a513c4491b8fc728d4796/userdata/oci-log --conmon-pidfile /run/user/1001/containers/overlay-containers/622206ecf7b8e49a65c0ec53fbc61a0a27dad6cb929a513c4491b8fc728d4796/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/aap/.local/share/containers/storage --exit-command-arg --runroot --exit-command-arg /run/user/1001/containers --exit-command-arg --log-level --exit-command-arg warning --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/user/1001/libpod/tmp --exit-command-arg --network-config-dir --exit-command-arg  --exit-command-arg --network-backend --exit-command-arg netavark --exit-command-arg --volumepath --exit-command-arg /var/lib/aap/.local/share/containers/storage/volumes --exit-command-arg --db-backend --exit-command-arg sqlite --exit-command-arg --transient-store=false --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --events-backend --exit-command-arg file 
--exit-command-arg container --exit-command-arg cleanup --exit-command-arg --stopped-only --exit-command-arg 622206ecf7b8e49a65c0ec53fbc61a0a27dad6cb929a513c4491b8fc728d4796

we should be able to detect them with pretty high probability

  • we can save the username to a file (e.g. sos_commands/aap_containerized/podman_user), with a note that sos only guessed it. This will be a hint for support engineers to prevent confusion.

  • or we can come up with a completely different, better heuristic. E.g. traversing the home directories of all users (not on NFS, that is terribly slow) and checking who has proper containers declared in ~/.local/share/containers/storage/overlay-containers/containers.json (if the file exists at all). Again, this heuristic can sometimes find no user (no harm, we just won't see an improvement from running the heuristic) and might sometimes find a wrong user (if a different user installed AAP containers; could this scenario even happen?)
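
A rough sketch of the containers.json idea, under assumptions: the file is taken to be a JSON list of objects that each carry a "names" list, and the `name_hint` filter and function name are hypothetical. The caller supplies the username-to-home mapping, so skipping NFS homes is left out of the sketch.

```python
import json
from pathlib import Path

# Path relative to a user's home directory, as mentioned above.
REL_PATH = ".local/share/containers/storage/overlay-containers/containers.json"


def users_with_containers(home_dirs, name_hint="automation"):
    """Map username -> container names for users whose containers.json
    declares at least one container whose name contains name_hint."""
    found = {}
    for user, home in home_dirs.items():
        path = Path(home) / REL_PATH
        try:
            entries = json.loads(path.read_text())
        except (OSError, ValueError):
            continue  # missing, unreadable, or not valid JSON
        if not isinstance(entries, list):
            continue  # assumed layout does not hold, skip this user
        names = [
            name
            for entry in entries if isinstance(entry, dict)
            for name in entry.get("names") or []
        ]
        if any(name_hint in name for name in names):
            found[user] = names
    return found
```

As with the ps approach, the result is only usable as a guess when exactly one user is returned.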

Does either idea sound useful?

@snagoor
Contributor

snagoor commented Nov 5, 2025

@pmoravec I'm okay with the suggested change. Logging the guessed username with a disclaimer will help avoid confusion. Let's proceed with this approach.
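
The agreed approach could look roughly like the sketch below. It is standalone Python with hypothetical names and wording; a real sos plugin would presumably go through the plugin's own helpers instead of writing files directly. The file name podman_user follows the suggestion above.

```python
import logging
from pathlib import Path

log = logging.getLogger("sos")  # logger name is an assumption


def record_guessed_user(username, out_dir):
    """Write the guessed username plus a disclaimer, and log a warning."""
    note = (
        f"{username}\n"
        "# NOTE: this username was guessed by a heuristic because the\n"
        "# username plugin option was not provided; it may be wrong.\n"
    )
    path = Path(out_dir) / "podman_user"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(note)
    log.warning("username not provided, guessed '%s'", username)
    return path
```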

@pmoravec
Contributor Author

pmoravec commented Nov 5, 2025

@TurboTurtle is this good to merge, or do you want to (re)review?

@TurboTurtle TurboTurtle added Reviewed/Ready for Merge Has been reviewed, ready for merge and removed Reviewed/Needs 2nd Ack Require a 2nd ack from a maintainer Status/Needs Review This issue still needs a review from project members labels Nov 5, 2025
@TurboTurtle TurboTurtle merged commit 2b3841c into sosreport:main Nov 5, 2025
37 checks passed
dwolstroRH pushed a commit to dwolstroRH/sos that referenced this pull request Nov 5, 2025