Fix logging for dropping capabilities by stepancheg · Pull Request #3436 · youki-dev/youki

stepancheg · 2026-03-01T09:12:59Z

Description

Logging is incomplete: if a syscall fails, the logs do not show which operation failed.

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Refactoring (no functional changes)
Performance improvement
Test updates
CI/CD related changes
Other (please describe):

Testing

Added new unit tests
Added new integration tests
Ran existing test suite
Tested manually (please provide steps)

DEBUG libcontainer::capabilities: reset all caps
DEBUG libcontainer::capabilities: dropping bounding capabilities to {}
DEBUG libcontainer::capabilities: dropping effective capabilities to {Setgid, IpcOwner, AuditWrite, Chown, Bpf, CheckpointRestore, BlockSuspend, NetRaw, Setfcap, SysPtrace, SysAdmin, NetAdmin, SysModule, MacOverride, DacReadSearch, Mknod, NetBroadcast, AuditControl, Setuid, IpcLock, Setpcap, Fowner, SysTime, SysNice, Syslog, SysBoot, Lease, SysRawio, SysPacct, LinuxImmutable, WakeAlarm, Kill, SysChroot, SysResource, AuditRead, SysTtyConfig, MacAdmin, Perfmon, NetBindService, Fsetid, DacOverride}
DEBUG libcontainer::capabilities: dropping permitted capabilities to {SysAdmin, NetBindService, AuditControl, SysTime, Fowner, CheckpointRestore, Bpf, Setuid, Lease, WakeAlarm, Chown, Mknod, IpcOwner, Fsetid, Setfcap, Setpcap, SysNice, SysChroot, DacReadSearch, Syslog, AuditRead, Setgid, SysRawio, Kill, IpcLock, SysBoot, SysPtrace, NetBroadcast, SysPacct, SysResource, LinuxImmutable, Perfmon, NetAdmin, SysTtyConfig, BlockSuspend, AuditWrite, SysModule, MacAdmin, MacOverride, NetRaw, DacOverride}
DEBUG libcontainer::capabilities: dropping inheritable capabilities to {BlockSuspend, SysTtyConfig, Chown, SysChroot, SysPacct, NetBindService, SysNice, DacOverride, Mknod, Syslog, Perfmon, Bpf, NetRaw, Setgid, NetAdmin, DacReadSearch, SysBoot, NetBroadcast, Fsetid, SysResource, Kill, IpcOwner, Lease, IpcLock, AuditRead, Setfcap, MacAdmin, SysModule, Fowner, Setpcap, Setuid, WakeAlarm, SysAdmin, SysPtrace, AuditControl, SysRawio, SysTime, CheckpointRestore, LinuxImmutable, AuditWrite, MacOverride}
ERROR libcontainer::process::init::process: failed to drop capabilities err=SetCaps(CapsError("capset failure: Operation not permitted (os error 1)"))

Related Issues

Additional Context

Signed-off-by: Stepan Koltsov <stepan.koltsov@gmail.com>

saku3 · 2026-03-01T21:11:09Z

I think it’s still worth applying this fix. That said, to pinpoint the failure we’ll need a strace to see exactly which syscall returns an error.

YJDoc2 · 2026-03-02T10:04:57Z

For the logs here, can we have a pair of log lines for each operation? One before the syscall saying "dropping xyz capabilities to ...", and one after the syscall saying "successfully dropped xyz capabilities". That way we will have a start and an end point for each operation in the logs, and we will know in which operation it failed.

@saku3 will this address your point? We already have a log of

ERROR libcontainer::process::init::process: failed to drop capabilities err=SetCaps(CapsError("capset failure: Operation not permitted (os error 1)"))

that shows setcaps failed.

stepancheg · 2026-03-02T11:35:29Z

@YJDoc2

we already have a log of ... that shows setcaps failed.

It does not show which one, and what were the parameters.

saku3 · 2026-03-03T11:40:24Z

@YJDoc2
Thanks.

You're right — even without taking an strace, we can tell which syscall failed from the logs.

And as you suggested, it would be even better if we could also see which capability set the failure occurred in.

YJDoc2 · 2026-03-04T04:53:45Z

@stepancheg

It does not show which one, and what were the parameters.

Maybe I was not clear in my last comment. What I meant was that we do need this PR and the changes here should be added; I would also ask you to change the logs so that there is one before and one after the syscall, as I mentioned. After that, we will have logs showing which caps youki was working on, and since we already have a log for the failed syscall, after this PR we should have a better understanding of cap set failures. We don't need any log in the syscall-related code itself, because there is already a syscall-failed log. Personally, I don't think we need another log just to show the syscall params on top of the changes in this PR.

If you can make the logging changes as suggested, I think this should be good to merge quickly. These are logging-only changes, so as long as CI passes, we are good.

stepancheg · 2026-03-04T05:27:48Z

@YJDoc2 I checked the youki code: nowhere is there a second log line after a successful operation, except maybe for some heavy high-level operations.

Separately, I think extensive logging might be better replaced with better error reporting. I would just introduce a dependency on anyhow and use the .context() function to produce an error with the context stack and all diagnostics, because I doubt anyone will try to match on specific error variants. That is, an error may look like this:

Failed to start process

Caused by:
  1. ...
  2. ...
  3. Failed to set bounding capabilities to ...
  4. Setcap failed
  5. Operation not permitted (os error 1)
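As an illustration of the error chain above, here is a minimal std-only sketch. It does not use anyhow itself: `ContextError`, the `Context` trait, `render_chain`, and the three simulated functions are hypothetical names made up for this example, and the real anyhow API and output format differ in detail.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper: a message plus an optional underlying cause.
#[derive(Debug)]
struct ContextError {
    msg: String,
    source: Option<Box<dyn Error + 'static>>,
}

impl ContextError {
    fn new(msg: impl Into<String>) -> Self {
        ContextError { msg: msg.into(), source: None }
    }
}

impl fmt::Display for ContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.msg)
    }
}

impl Error for ContextError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref()
    }
}

/// Hypothetical stand-in for an anyhow-style `.context()` extension method.
trait Context<T> {
    fn context(self, msg: &str) -> Result<T, ContextError>;
}

impl<T, E: Error + 'static> Context<T> for Result<T, E> {
    fn context(self, msg: &str) -> Result<T, ContextError> {
        // Wrap the original error as the cause of a new, higher-level error.
        self.map_err(|e| ContextError { msg: msg.to_string(), source: Some(Box::new(e)) })
    }
}

/// Walk `source()` links and render the top-level message plus numbered causes.
fn render_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut cause = err.source();
    let mut index = 1;
    while let Some(e) = cause {
        if index == 1 {
            out.push_str("\n\nCaused by:");
        }
        out.push_str(&format!("\n  {index}. {e}"));
        cause = e.source();
        index += 1;
    }
    out
}

// Simulated call stack; none of these functions exist in youki.
fn capset() -> Result<(), ContextError> {
    Err(ContextError::new("Operation not permitted (os error 1)"))
}

fn drop_bounding() -> Result<(), ContextError> {
    capset()
        .context("Setcap failed")
        .context("Failed to set bounding capabilities to {}")
}

fn start_process() -> Result<(), ContextError> {
    drop_bounding().context("Failed to start process")
}

fn main() {
    if let Err(e) = start_process() {
        eprintln!("{}", render_chain(&e));
    }
}
```

Each layer only adds what it knows (which capability set, which arguments), so the parameters travel with the error instead of being spread over separate log lines. Note that every wrapped error here is boxed, i.e. heap-allocated.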

utam0k · 2026-03-30T11:12:16Z

@stepancheg In the case of anyhow, will errors in forked processes be displayed properly? I am just curious. Please let me know if you happen to know.

utam0k

There are other points to discuss, but this PR is worth merging.

YJDoc2

While I want to get these changes merged, I'd strongly prefer to have a success log after as well. For example, instead of just

tracing::debug!("dropping permitted capabilities to {:?}", permitted);
syscall.set_capability(CapSet::Permitted, &to_set(permitted))?;

we should have

tracing::debug!("dropping permitted capabilities to {:?}", permitted);
syscall.set_capability(CapSet::Permitted, &to_set(permitted))?;
tracing::debug!("successfully dropped permitted capabilities");

The current version is OK, and I don't think the above change should block the PR from merging, so I am approving, but I would request that this be addressed in another PR.
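To show how the before/after pattern suggested above narrows down a failure, here is a small self-contained sketch. It is not youki's code: `FakeSyscall`, the local `CapSet` enum, and `drop_privileges` are hypothetical stand-ins, and a `Vec<String>` replaces `tracing::debug!` so that the example runs with the standard library only.

```rust
/// Local stand-in for the capability set selector (assumption, not the real type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum CapSet {
    Bounding,
    Effective,
    Permitted,
    Inheritable,
}

/// Fake syscall layer that fails on one chosen capability set.
struct FakeSyscall {
    fail_on: Option<CapSet>,
}

impl FakeSyscall {
    fn set_capability(&self, set: CapSet, _caps: &[&str]) -> Result<(), String> {
        if self.fail_on == Some(set) {
            Err("capset failure: Operation not permitted (os error 1)".to_string())
        } else {
            Ok(())
        }
    }
}

/// Log one line before and one line after each operation.
/// A "dropping" line without a matching "successfully dropped" line
/// marks the operation that failed.
fn drop_privileges(
    syscall: &FakeSyscall,
    caps: &[&str],
    log: &mut Vec<String>,
) -> Result<(), String> {
    for set in [CapSet::Bounding, CapSet::Effective, CapSet::Permitted, CapSet::Inheritable] {
        log.push(format!("dropping {set:?} capabilities to {caps:?}"));
        syscall.set_capability(set, caps)?;
        log.push(format!("successfully dropped {set:?} capabilities"));
    }
    Ok(())
}

fn main() {
    let syscall = FakeSyscall { fail_on: Some(CapSet::Permitted) };
    let mut log = Vec::new();
    let result = drop_privileges(&syscall, &["Chown", "Kill"], &mut log);
    for line in &log {
        println!("DEBUG {line}");
    }
    if let Err(e) = result {
        println!("ERROR failed to drop capabilities err={e}");
    }
}
```

With the failure injected on the permitted set, the last debug line is a "dropping" line with no success line after it, which identifies the failing operation and its arguments.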

stepancheg · 2026-03-30T11:26:13Z

In the case of anyhow, will errors in forked processes be displayed properly?

This code is executed in a forked process? OoO.

I think there are a lot of things that may go wrong.

No, anyhow is not safe: malloc is illegal in a forked process, and an anyhow error always allocates.

Anyhow formatting per se is more or less safe (except backtrace formatting if backtrace capturing is enabled).

But even tracing::debug! likely buffers the message in a string, which is allocation and illegal.
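For reference, formatting a message without heap allocation is possible by writing into a fixed-size stack buffer through `std::fmt::Write`. The sketch below is only an illustration of that idea under stated assumptions: `StackBuf` is a hypothetical type, not something youki or tracing provides, and it says nothing about whether the rest of a given code path is allocation-free (a `Display` impl of a formatted value may still allocate).

```rust
use std::fmt::{self, Write};

/// Fixed-size, stack-allocated text buffer (hypothetical helper).
/// Writes past the end are silently truncated, possibly in the middle
/// of a multi-byte UTF-8 character, so the content is exposed as bytes.
struct StackBuf<const N: usize> {
    data: [u8; N],
    len: usize,
}

impl<const N: usize> StackBuf<N> {
    fn new() -> Self {
        StackBuf { data: [0; N], len: 0 }
    }

    fn as_bytes(&self) -> &[u8] {
        &self.data[..self.len]
    }
}

impl<const N: usize> Write for StackBuf<N> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        // Copy as many bytes as still fit; drop the rest.
        let room = N - self.len;
        let take = s.len().min(room);
        self.data[self.len..self.len + take].copy_from_slice(&s.as_bytes()[..take]);
        self.len += take;
        Ok(())
    }
}

fn main() {
    let mut buf = StackBuf::<128>::new();
    let errno = 1;
    // Formatting string slices and integers goes through core::fmt,
    // which writes piecewise into `buf` without building a String.
    let _ = write!(buf, "failed to set {} capabilities: os error {}", "permitted", errno);

    // A child process would hand these bytes to write(2) on a descriptor
    // opened before the fork; printing here is only for demonstration.
    println!("{}", String::from_utf8_lossy(buf.as_bytes()));
}
```

The trade-off is a hard cap on message length, which matters for output like the full capability sets shown in the logs above.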

utam0k · 2026-03-30T12:06:48Z

Regardless of the changes in this PR, I suppose we could use anyhow throughout the entire project.

youki performs various tasks in the forked (cloned) process, and it has a history of encountering all kinds of bugs in child processes. Determining how to provide user-friendly errors is a difficult problem, and we still do not know what the correct approach is.

utam0k · 2026-03-30T12:08:41Z

@YJDoc2 If you don't mind, please merge this PR.

YJDoc2 · 2026-03-30T12:41:36Z

@YJDoc2 If you don't mind, please merge this PR.

Yes, my comment is not a blocker. Going ahead and merging :)

Fix logging for dropping capabilities

ca29917

Signed-off-by: Stepan Koltsov <stepan.koltsov@gmail.com>

stepancheg force-pushed the logging-capabilities branch from 307dfff to ca29917 on March 1, 2026 09:16

stepancheg mentioned this pull request Mar 1, 2026

[Bug]: failed to drop capabilities in youki exec #3434

Open

saku3 added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Mar 1, 2026

utam0k approved these changes Mar 30, 2026

utam0k requested a review from YJDoc2 March 30, 2026 11:13

YJDoc2 approved these changes Mar 30, 2026

YJDoc2 enabled auto-merge March 30, 2026 12:42

YJDoc2 disabled auto-merge March 30, 2026 12:43

YJDoc2 merged commit ff3c56d into youki-dev:main Mar 30, 2026
28 of 29 checks passed

github-actions bot mentioned this pull request Mar 30, 2026

Release for v0.6.1 #3440

Open

4 participants
