-
Notifications
You must be signed in to change notification settings - Fork 14
Description
When implementing #854 / ruby#15219, I carried over a pattern that seemed concerning.
The implementation is split across Ruby and native code, Rust in this case. Here is a simplified version:
class RubyVM::YJIT
def self.enable(stats: ...)
return false if enabled?
if Primitive.cexpr! 'RBOOL(rb_zjit_enabled_p)'
# as an aside, should this actually raise? something really strange is going on in your code if this actually happens
warn("Only one JIT can be enabled at the same time.")
return false
end
# check args and raise ArgumentError if needed
at_exit { print_and_dump_stats } if stats
Primitive.rb_yjit_enable(stats, ...)
end
end#[no_mangle]
pub extern "C" fn rb_yjit_enable(...) -> VALUE {
with_vm_lock(src_loc!(), || {
// without rechecking the enabled state, we proceed to mutate internal VM state
// ...
yjit_init();
// Add "+YJIT" to RUBY_DESCRIPTION
extern "C" {
fn ruby_set_yjit_description();
}
unsafe { ruby_set_yjit_description(); }
Qtrue
})
}To my eyes, this seems problematic. By checking for the enabled state (both enabled? itself, and also checking for if the other JIT is active) in the Ruby code, it looks like we have a time-of-check to time-of-use race here, and it is possible to context switch in between, and more than one threads/ractors can end up running the JIT-enable block multiple times resulting in a crash, or enabling YJIT/ZJIT at the same time, when that becomes a possibility.
I believe the only correct way to do the check is to do it in native code after acquiring the lock, but as I looked into this more, I found another potential problem. According to this piece of documentation, we can't call back into ruby while holding the VM lock, but one of the first thing we do in yjit_init() is calling the JIT hooks with rb_funcall().
I'm also not sure if this is the right lock either, as the documentation says it doesn't preclude other ractors from running. We are mutating a bunch of global variables (static mut) in the block, if it doesn't block out other readers, that seems problematic also (e.g. another ractor can try to read yjit_enabled_p while it's being written).
(Aside: right now, you won't succeed in enabling the JIT from the non-main ractor, because the JIT hooks are stored in a non-frozen class instance variable, so that will eventually raise an error. I think it's totally reasonable to restrict enabling the JIT to the main factor, but this seems like relying on a a fragile implementation detail to get that result.)
Happy to work on a fix – and I think I need to resolve this before I can add support for calling the JIT hooks to ZJIT – but I think I'll need some help brainstorming the right approach here.