Fix logging for dropping capabilities #3436

Merged
YJDoc2 merged 1 commit into youki-dev:main from stepancheg:logging-capabilities
Mar 30, 2026

Conversation

@stepancheg
Contributor

@stepancheg stepancheg commented Mar 1, 2026

Description

Logging is incomplete. If a syscall fails, the logs do not show what failed.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test updates
  • CI/CD related changes
  • Other (please describe):

Testing

  • Added new unit tests
  • Added new integration tests
  • Ran existing test suite
  • Tested manually (please provide steps)
DEBUG libcontainer::capabilities: reset all caps
DEBUG libcontainer::capabilities: dropping bounding capabilities to {}
DEBUG libcontainer::capabilities: dropping effective capabilities to {Setgid, IpcOwner, AuditWrite, Chown, Bpf, CheckpointRestore, BlockSuspend, NetRaw, Setfcap, SysPtrace, SysAdmin, NetAdmin, SysModule, MacOverride, DacReadSearch, Mknod, NetBroadcast, AuditControl, Setuid, IpcLock, Setpcap, Fowner, SysTime, SysNice, Syslog, SysBoot, Lease, SysRawio, SysPacct, LinuxImmutable, WakeAlarm, Kill, SysChroot, SysResource, AuditRead, SysTtyConfig, MacAdmin, Perfmon, NetBindService, Fsetid, DacOverride}
DEBUG libcontainer::capabilities: dropping permitted capabilities to {SysAdmin, NetBindService, AuditControl, SysTime, Fowner, CheckpointRestore, Bpf, Setuid, Lease, WakeAlarm, Chown, Mknod, IpcOwner, Fsetid, Setfcap, Setpcap, SysNice, SysChroot, DacReadSearch, Syslog, AuditRead, Setgid, SysRawio, Kill, IpcLock, SysBoot, SysPtrace, NetBroadcast, SysPacct, SysResource, LinuxImmutable, Perfmon, NetAdmin, SysTtyConfig, BlockSuspend, AuditWrite, SysModule, MacAdmin, MacOverride, NetRaw, DacOverride}
DEBUG libcontainer::capabilities: dropping inheritable capabilities to {BlockSuspend, SysTtyConfig, Chown, SysChroot, SysPacct, NetBindService, SysNice, DacOverride, Mknod, Syslog, Perfmon, Bpf, NetRaw, Setgid, NetAdmin, DacReadSearch, SysBoot, NetBroadcast, Fsetid, SysResource, Kill, IpcOwner, Lease, IpcLock, AuditRead, Setfcap, MacAdmin, SysModule, Fowner, Setpcap, Setuid, WakeAlarm, SysAdmin, SysPtrace, AuditControl, SysRawio, SysTime, CheckpointRestore, LinuxImmutable, AuditWrite, MacOverride}
ERROR libcontainer::process::init::process: failed to drop capabilities err=SetCaps(CapsError("capset failure: Operation not permitted (os error 1)"))

Related Issues

Additional Context

Signed-off-by: Stepan Koltsov <stepan.koltsov@gmail.com>
@stepancheg stepancheg force-pushed the logging-capabilities branch from 307dfff to ca29917 Compare March 1, 2026 09:16
@saku3 saku3 added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Mar 1, 2026
@saku3
Member

saku3 commented Mar 1, 2026

I think it’s still worth applying this fix. That said, to pinpoint the failure we’ll need a strace to see exactly which syscall returns an error.

@YJDoc2
Collaborator

YJDoc2 commented Mar 2, 2026

For the logs here, can we have a pair of logs for each operation? One before the syscall saying "dropping xyz capabilities to ..." and then one after the syscall saying "successfully dropped xyz capabilities". That way we will have a start and end point for each op in the logs, and we will know in which operation it failed.

@saku3 will this help with your point? We already have a log of

ERROR libcontainer::process::init::process: failed to drop capabilities err=SetCaps(CapsError("capset failure: Operation not permitted (os error 1)"))

that shows setcaps failed.
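
A minimal sketch of the paired-log idea described above, using only the standard library. The `CapSet` enum, the `Syscall` trait, `FailOn`, and the `log` sink are hypothetical stand-ins for illustration, not youki's actual types; real code would presumably call `tracing::debug!` instead of pushing to a vector.

```rust
/// Hypothetical stand-in for the four capability sets in the log output.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CapSet {
    Bounding,
    Effective,
    Permitted,
    Inheritable,
}

/// Hypothetical stand-in for the syscall layer; not youki's real trait.
trait Syscall {
    fn set_capability(&self, set: CapSet, caps: &[&str]) -> Result<(), String>;
}

/// Logs once before the operation and once after it succeeds.
/// `log` is a plain sink so the sketch needs no logging crate.
fn drop_set(
    syscall: &dyn Syscall,
    set: CapSet,
    caps: &[&str],
    log: &mut Vec<String>,
) -> Result<(), String> {
    log.push(format!("dropping {:?} capabilities to {:?}", set, caps));
    syscall.set_capability(set, caps)?;
    log.push(format!("successfully dropped {:?} capabilities", set));
    Ok(())
}

/// Test double that fails for exactly one capability set.
struct FailOn(CapSet);

impl Syscall for FailOn {
    fn set_capability(&self, set: CapSet, _caps: &[&str]) -> Result<(), String> {
        if set == self.0 {
            Err("capset failure: Operation not permitted (os error 1)".to_string())
        } else {
            Ok(())
        }
    }
}

/// Drops the sets in order and stops at the first failure.
fn run_demo() -> (Vec<String>, Result<(), String>) {
    let syscall = FailOn(CapSet::Permitted);
    let mut log = Vec::new();
    let sets = [
        CapSet::Bounding,
        CapSet::Effective,
        CapSet::Permitted,
        CapSet::Inheritable,
    ];
    let result = sets
        .iter()
        .try_for_each(|set| drop_set(&syscall, *set, &["Chown", "Kill"], &mut log));
    (log, result)
}

fn main() {
    let (log, result) = run_demo();
    for line in &log {
        println!("DEBUG {line}");
    }
    println!("result: {result:?}");
}
```

With this shape, the last "dropping ..." line without a matching "successfully dropped ..." line identifies the operation that failed, and it also records the parameters that were passed.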

@stepancheg
Contributor Author

@YJDoc2

> we already have a log of ... that shows setcaps failed.

It does not show which one, and what were the parameters.

@saku3
Member

saku3 commented Mar 3, 2026

@YJDoc2
Thanks.

You're right — even without taking an strace, we can tell which syscall failed from the logs.

And as you suggested, it would be even better if we could also see which capability set the failure occurred in.

@YJDoc2
Collaborator

YJDoc2 commented Mar 4, 2026

@stepancheg

> It does not show which one, and what were the parameters.

Maybe I was not clear in my last comment. What I meant was that we need this PR and the changes here should be added; I will also ask you to change the logs to have one before and one after the syscall, as I mentioned. After that, we will have logs showing which caps youki was working on, and we already have a log for the failed syscall, so after this PR we should have a better understanding of cap set failures. We don't need any log in the syscall-related code itself, because there is already a syscall-failed log. Personally, I don't think we need another log to show just the syscall params along with the changes in this PR.

If you can make the changes to logging as suggested, I think this should be good to merge quickly, only logging changes, so as long as CI passes, we are good.

@stepancheg
Contributor Author

@YJDoc2 I checked the youki code: nowhere is there a second log line after a successful operation, except maybe for some heavy high-level operations.

Separately, I think extensive logging might better be replaced with better error reporting. I would just introduce a dependency on anyhow and use the .context() function to produce an error with a stack and all diagnostics, because I doubt anyone will try to match on specific error variants. That is, the error may look like this:

Failed to start process

Caused by:
  1. ...
  2. ...
  3. Failed to set bounding capabilities to ...
  4. Setcap failed
  5. Operation not permitted (os error 1)
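
For illustration, a rough std-only sketch of how such a cause chain can be built and rendered. It imitates the effect of anyhow's `.context()` rather than using the crate itself; `Context`, `with_context`, `render_chain`, `capset`, and `start_process` are all hypothetical names, and the failing `capset` is simulated with OS error 1.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper: a message plus the underlying cause.
#[derive(Debug)]
struct Context {
    message: String,
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for Context {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.message)
    }
}

impl Error for Context {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

/// Hypothetical helper that adds one layer of context to any error.
fn with_context<T, E>(result: Result<T, E>, message: &str) -> Result<T, Context>
where
    E: Error + 'static,
{
    result.map_err(|e| Context {
        message: message.to_string(),
        source: Box::new(e),
    })
}

/// Renders the outermost error followed by a numbered list of causes.
fn render_chain(err: &dyn Error) -> String {
    let mut out = format!("{err}\n\nCaused by:\n");
    let mut cause = err.source();
    let mut index = 1;
    while let Some(e) = cause {
        out.push_str(&format!("  {index}. {e}\n"));
        cause = e.source();
        index += 1;
    }
    out
}

/// Simulated failing capset call, standing in for the real syscall.
fn capset() -> Result<(), std::io::Error> {
    Err(std::io::Error::from_raw_os_error(1))
}

/// Each layer wraps the error from the layer below with its own message.
fn start_process() -> Result<(), Context> {
    let r = with_context(capset(), "Setcap failed");
    let r = with_context(r, "Failed to set bounding capabilities to {}");
    with_context(r, "Failed to start process")
}

fn main() {
    if let Err(e) = start_process() {
        print!("{}", render_chain(&e));
    }
}
```

Note that every layer here allocates (a `String` and a `Box`), which matters for the forked-process concern raised later in this thread.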

@utam0k
Member

utam0k commented Mar 30, 2026

@stepancheg In the case of anyhow, will errors in forked processes be displayed properly? I am just curious. Please let me know if you happen to know.

Member

@utam0k utam0k left a comment

There are other points to discuss, but this PR is worth merging.

@utam0k utam0k requested a review from YJDoc2 March 30, 2026 11:13
Collaborator

@YJDoc2 YJDoc2 left a comment

While I want to get these changes merged, I'd strongly prefer to have a success log after as well. For example, instead of just

tracing::debug!("dropping permitted capabilities to {:?}", permitted);
syscall.set_capability(CapSet::Permitted, &to_set(permitted))?;

we should have

tracing::debug!("dropping permitted capabilities to {:?}", permitted);
syscall.set_capability(CapSet::Permitted, &to_set(permitted))?;
tracing::debug!("successfully dropped permitted capabilities");

The current version is OK. I don't think the above change should block the PR from merging, so I'm approving, but I would request that this be addressed in another PR.

@stepancheg
Contributor Author

stepancheg commented Mar 30, 2026

> In the case of anyhow, will errors in forked processes be displayed properly?

This code is executed in a forked process? OoO.

I think there's a lot of things that may go wrong.

No, anyhow is not safe: malloc is illegal in a forked process, and an anyhow error always allocates.

Anyhow formatting per se is more or less safe (except backtrace formatting if backtrace capturing is enabled).

But even tracing::debug! likely buffers the message in a string, which is allocation and illegal.
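
To make the allocation concern concrete, here is a hedged sketch of formatting a message into a fixed-size stack buffer through `std::fmt::Write`, so the formatting step itself does not touch the heap. `StackBuf` and `format_failure` are hypothetical names, and whether this alone would be sufficient in youki's child process is not established in this thread.

```rust
use std::fmt::{self, Write};

/// Fixed-capacity buffer stored inline; writing to it never uses the heap.
struct StackBuf<const N: usize> {
    bytes: [u8; N],
    len: usize,
}

impl<const N: usize> StackBuf<N> {
    fn new() -> Self {
        StackBuf { bytes: [0; N], len: 0 }
    }

    /// The bytes written so far.
    fn as_bytes(&self) -> &[u8] {
        &self.bytes[..self.len]
    }
}

impl<const N: usize> Write for StackBuf<N> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        // Reject the write instead of growing: there is nothing to grow into.
        let end = self.len.checked_add(s.len()).ok_or(fmt::Error)?;
        if end > N {
            return Err(fmt::Error);
        }
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formats a failure message into the caller-provided buffer.
fn format_failure<const N: usize>(
    buf: &mut StackBuf<N>,
    set: &str,
    errno: i32,
) -> fmt::Result {
    write!(buf, "failed to drop {set} capabilities: errno {errno}")
}

fn main() {
    let mut buf = StackBuf::<128>::new();
    match format_failure(&mut buf, "permitted", 1) {
        // Printing here is for the demo only; a forked child would more
        // likely hand `as_bytes()` to a raw write(2) on a file descriptor.
        Ok(()) => println!("{}", std::str::from_utf8(buf.as_bytes()).unwrap_or("<invalid utf-8>")),
        Err(_) => println!("message did not fit in the buffer"),
    }
}
```

The trade-off is a hard size limit: a message that does not fit is rejected rather than truncated or grown, so the buffer size has to be chosen with the longest expected message in mind.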

@utam0k
Member

utam0k commented Mar 30, 2026

Regardless of the changes in this PR, I suppose to use anyhow throughout the entire project.

youki performs various tasks in the forked (cloned) process, but youki has a history of encountering all kinds of bugs in child processes. Determining how to provide user friendly errors is a difficult problem, and we still do not know what the correct approach is.

@utam0k
Member

utam0k commented Mar 30, 2026

@YJDoc2 If you don't mind, please merge this PR.

@YJDoc2
Collaborator

YJDoc2 commented Mar 30, 2026

> @YJDoc2 If you don't mind, please merge this PR.

Yes, my comment is not a blocker. Going ahead and merging :)

@YJDoc2 YJDoc2 enabled auto-merge March 30, 2026 12:42
@YJDoc2 YJDoc2 disabled auto-merge March 30, 2026 12:43
@YJDoc2 YJDoc2 merged commit ff3c56d into youki-dev:main Mar 30, 2026
28 of 29 checks passed
@github-actions github-actions bot mentioned this pull request Mar 30, 2026
Labels

kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt.

4 participants