YJIT/ZJIT.enable is not thread/ractor-safe

When implementing #854 / ruby/ruby#15219, I carried over a pattern that seemed concerning.

The implementation is split across Ruby and native code, Rust in this case. Here is a simplified version:

```rb
class RubyVM::YJIT
  def self.enable(stats: ...)
    return false if enabled?

    if Primitive.cexpr! 'RBOOL(rb_zjit_enabled_p)'
      # as an aside, should this actually raise? something really strange is going on in your code if this actually happens
      warn("Only one JIT can be enabled at the same time.")
      return false
    end

    # check args and raise ArgumentError if needed

    at_exit { print_and_dump_stats } if stats

    Primitive.rb_yjit_enable(stats, ...)
  end
end
```

```rs
#[no_mangle]
pub extern "C" fn rb_yjit_enable(...) -> VALUE {
    with_vm_lock(src_loc!(), || {
        // without rechecking the enabled state, we proceed to mutate internal VM state
        // ...

        yjit_init();

        // Add "+YJIT" to RUBY_DESCRIPTION
        extern "C" {
            fn ruby_set_yjit_description();
        }
        unsafe { ruby_set_yjit_description(); }

        Qtrue
    })
}
```

To my eyes, this seems problematic. By checking for the enabled state (both `enabled?` itself, and also checking for if the other JIT is active) in the Ruby code, it looks like we have a time-of-check to time-of-use race here, and it is possible to context switch in between, and more than one threads/ractors can end up running the JIT-enable block multiple times resulting in a crash, or enabling YJIT/ZJIT at the same time, when that becomes a possibility.

I believe the only correct way to do the check is to do it in native code after acquiring the lock, but as I looked into this more, I found another potential problem. According to [this piece of documentation](https://docs.ruby-lang.org/en/master/contributing/concurrency_guide_md.html#label-The+VM+Lock), we can't call back into ruby while holding the VM lock, but one of the first thing we do in `yjit_init()` is calling the JIT hooks with `rb_funcall()`.

I'm also not sure if this is _the right lock_ either, as the documentation says it doesn't preclude other ractors from running. We are mutating a bunch of global variables (`static mut`) in the block, if it doesn't block out other readers, that seems problematic also (e.g. another ractor can try to read `yjit_enabled_p` while it's being written).

(Aside: right now, you won't succeed in enabling the JIT from the non-main ractor, because the JIT hooks are stored in a non-frozen class instance variable, so that will eventually raise an error. I think it's totally reasonable to restrict enabling the JIT to the main factor, but this seems like relying on a a fragile implementation detail to get that result.)

Happy to work on a fix – and I think I need to resolve this before I can add support for calling the JIT hooks to ZJIT – but I think I'll need some help brainstorming the right approach here.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

YJIT/ZJIT.enable is not thread/ractor-safe #883

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

YJIT/ZJIT.enable is not thread/ractor-safe #883

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions