Skip to content

std.posix: Rewriting enums to seperate clock IDs for clock_gettime() and timerfd_create()#21347

Closed
chrboesch wants to merge 0 commit intoziglang:masterfrom
chrboesch:i20334
Closed

std.posix: Rewriting enums to seperate clock IDs for clock_gettime() and timerfd_create()#21347
chrboesch wants to merge 0 commit intoziglang:masterfrom
chrboesch:i20334

Conversation

@chrboesch
Copy link
Contributor

@chrboesch chrboesch commented Sep 8, 2024

In the linux kernel header file (time.h) is the note, that the driver implementing the clock IDs

  • SGI_CYCLE
  • TAI
    got removed. Therefore, calling timerfd_create() with these IDs will result in an error. So I disabled them.

@chrboesch chrboesch changed the title Disable no longer used clock IDs in Linux std.os.linux: Disable no longer used clock IDs Sep 8, 2024
Copy link
Member

@alexrp alexrp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me.

However, I do not believe this fixes #20334 because there are still more clocks defined here than timerfd_create() will accept. And it seems like there needs to be some discussion on that issue as to how that should be addressed.

@chrboesch
Copy link
Contributor Author

I have tried every other clock ID on my linux computer and it works fine and as expected. And since the clock IDs are defined as enums, contrary to the assumption of the thread creator, you can only select the ones entered. So it will definitely fix the issue.

@alexrp
Copy link
Member

alexrp commented Sep 10, 2024

So just to be clear, even clocks like PROCESS_CPUTIME_ID and THREAD_CPUTIME_ID work with timerfd_create()?

@chrboesch
Copy link
Contributor Author

The fact that you ask means it doesn't work on your computer. Let me do some testing on different computers.

@alexrp
Copy link
Member

alexrp commented Sep 10, 2024

No no, I haven't actually tried. I'm only asking because those specifically are somewhat special clocks, so I just wouldn't take it for granted that they work with timerfd_create(). I'm happy to take your word for it if you've tested them and they worked.

@chrboesch
Copy link
Contributor Author

chrboesch commented Sep 10, 2024

First of all, thanks for your skepticism. It was my bad that I did some quick tests in C on my Gentoo with self-compiled kernel. And that I mixed timerfd_create with getclock_time because there is no separation in the header file time.h. Interestingly, the timers even work in Zig even though they shouldn't. But only on my self-compiled kernel. And sometimes only with root rights.

However, the man page timerfd_create(2) says there are actually only two valid timers: CLOCK_REALTIME and CLOCK_MONOTONIC. So, I think we should limit the enum of these clock IDs, because it doesn't make sense if a timer sometimes works and sometimes doesn't. And it's hard to explain that you can achieve this with some special kernel settings. 😉

What surprises me a little, however, is that the equivalent POSIX command is 'timer_create'. And it also knows CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID, but is triggered by different Linux commands. In other words, the over all change should not only be limited to restricting the enum, but also to correctly mapping the POSIX command.

TL;DR
My suggestion: Firstly, we limit the enum to the actually possible clock IDs (that fixes the issue) and secondly, we should correct the POSIX command later. What do you think?

Addendum: Since clock_id is also used in Zig for clock_gettime, this is a bit more complicated. We therefore need to have a second enum just for timercreate_fd. That's probably what the issue creator meant?

Addendum 2: To add to the confusion, Zig uses clockid and clockid_t equivalently, which is incorrect, as Linux distinguishes between them. This makes it difficult to simply add an enum clockid.

Addendum 3: It's not Zig's fault, it's Linux's fault. There, the term 'clockid' is used in the same way in different contexts, which contributes to this confusion. These are the things that contribute to instability and errors in operation because developers mistakenly confuse the two. I'll think about how we can make this better for Zig.

@alexrp
Copy link
Member

alexrp commented Sep 10, 2024

First of all, thanks for your skepticism. It was my bad that I did some quick tests in C on my Gentoo with self-compiled kernel. And that I mixed timerfd_create with getclock_time because there is no separation in the header file time.h. Interestingly, the timers even work in Zig even though they shouldn't. But only on my self-compiled kernel. And sometimes only with root rights.

It wouldn't surprise me if some distros disable those clocks I mentioned for non-superusers out of security concerns or something like that. But I'm just guessing here.

However, the man page timerfd_create(2) says there are actually only two valid timers: CLOCK_REALTIME and CLOCK_MONOTONIC. So, I think we should limit the enum of these clock IDs, because it doesn't make sense if a timer sometimes works and sometimes doesn't. And it's hard to explain that you can achieve this with some special kernel settings. 😉

Generally speaking, the Linux kernel code is the source of truth; the man pages are sometimes incomplete or wrong. I think we should base our decision on what the kernel code is written to allow. If it is the case (and I don't know if it is) that the kernel can return EINVAL for some clock IDs depending on factors outside of the programmer's control, then EINVAL should be considered an expected error and be converted to a Zig error accordingly.

I think all the Linux system calls that take clock IDs should be checked to see what's allowed before we decide what to do here (other than removing SGI_CYCLE and TAI, which should be uncontroversial).

@chrboesch
Copy link
Contributor Author

Ok, I removed the "fixes #20334", because it doesn't. So the PR is valid. And then we should discuss in the issue what we should do to solve the problem in general.

@chrboesch chrboesch requested a review from kprotty as a code owner September 11, 2024 10:01
@chrboesch chrboesch changed the title std.os.linux: Disable no longer used clock IDs std.posix: Rewriting enums to seperate clock IDs for clock_gettime() and timerfd_create() Sep 11, 2024
Copy link
Contributor

@rootbeer rootbeer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the cleanup of clock_gettime(). I've got a couple drive-by comments, but feel free to ignore them if they don't make sense to you.


pub const CLOCK_ID = clock_id;

pub const clock_id = enum(u32) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a short comment here and on clockid_t differentiating them? (I think just labelling which API/syscall they're aimed at would be sufficient? clock_gettime vs timerfd_create?)

Copy link
Contributor Author

@chrboesch chrboesch Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, makes sense.

if (native_os != .linux) return error.SkipZigTest;

const tfd = try posix.timerfd_create(.MONOTONIC, .{ .CLOEXEC = true });
const tfd = try posix.timerfd_create(posix.clock_id.MONOTONIC, .{ .CLOEXEC = true });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The posix.clock_id. prefix shouldn't be necessary here, Zig should pick the right .MONOTONIC from the type signature of timerfd_create.

Copy link
Contributor Author

@chrboesch chrboesch Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.

@chrboesch
Copy link
Contributor Author

Now this PR fixes #20334.

return syscall2(.eventfd2, count, flags);
}

pub fn timerfd_create(clockid: clockid_t, flags: TFD) usize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at the PR in a few days so I might be missing some context here - why are we changing this to an integer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I based my work on the kernel sources, there it is an integer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't mean it quite that literally when I said the kernel code should be the basis for our decision here. 🙂 We do use enums in syscall wrappers when the set of acceptable values is known ahead of time - which I think is the case here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was unsure but can change that.

lib/std/c.zig Outdated
Comment on lines +220 to +221
pub const CLOCK_ID = clock_id;
pub const clock_id = switch (native_os) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub const CLOCK_ID = clock_id;
pub const clock_id = switch (native_os) {
pub const TIMERFD_CLOCK = clock_id;
pub const timerfd_clockid_t = switch (native_os) {

Let's use naming that's more consistent with the existing CLOCK/clockid_t.

lib/std/c.zig Outdated
MONOTONIC = 1,
_,
},
else => void,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
else => void,
else => clockid_t,

When we don't know better, just fall back to clockid_t as before. Otherwise this represents a compile error regression on affected platforms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, but the line numbers from your suggestion are outdated. I think I have to open a new PR?

@chrboesch chrboesch closed this Jan 27, 2025
@chrboesch chrboesch deleted the i20334 branch January 27, 2025 09:50
@chrboesch chrboesch restored the i20334 branch January 27, 2025 11:02
@chrboesch
Copy link
Contributor Author

Why is this branch closed now, I only wanted to resolve the suggestions?

@mlugg
Copy link
Member

mlugg commented Jan 27, 2025

You're the one who closed it; you should be able to reopen it fine. If GitHub doesn't let you reopen this one, feel free to make a new PR.

@chrboesch
Copy link
Contributor Author

You're the one who closed it;

But how did I close, with clicking on "resolve"?

@mlugg
Copy link
Member

mlugg commented Jan 27, 2025

Hitting "resolve" wouldn't have closed the PR. You might have accidentally hit the "close" button at the bottom of the page, or you might have accidentally deleted the branch (it appears that it was deleted then restored) leading to GitHub automatically closing the PR.

@chrboesch
Copy link
Contributor Author

Github shows:
Pull request successfully merged and closed

You’re all set—the chrboesch:i20334 branch can be safely deleted.

@alexrp
Copy link
Member

alexrp commented Jan 27, 2025

Are you sure you didn't accidentally force-push to your i20334 branch or something?

Have a look at git reflog just to be sure.

@chrboesch
Copy link
Contributor Author

Have a look at git reflog just to be sure.

What should I see?

@alexrp
Copy link
Member

alexrp commented Jan 27, 2025

Hard to give a general answer to that, but basically look for potentially destructive operations like reset and rebase.

@chrboesch
Copy link
Contributor Author

On my local branch is nothing to see with "force", "reset" or "rebase". And here in PR the entire history is gone. How can that be?

@chrboesch
Copy link
Contributor Author

Try to reopen.

@chrboesch
Copy link
Contributor Author

This somehow doesn't make any sense. I can only create a new PR from my local repo and link to it. Right?

@alexrp
Copy link
Member

alexrp commented Jan 27, 2025

Well I'm not sure what happened, but the good news is your orphaned commits still exist on the GitHub side (for now; they get GC'd eventually): https://github.com/chrboesch/zig/commits/08c5c4cedfee5dc22d4d23c798410ef62223494c?author=chrboesch

So you can at least manually pick them out, which is better than nothing. I suggest just creating a new branch + PR.

@chrboesch
Copy link
Contributor Author

Thanks will do that.

@chrboesch chrboesch deleted the i20334 branch January 27, 2025 12:17
@chrboesch
Copy link
Contributor Author

@alexrp I restored my backup of the old branch and with it all changes for a new PR. 😃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants