Skip to content

Conversation

@eregon
Copy link
Member

@eregon eregon commented Jun 14, 2025

@eregon
Copy link
Member Author

eregon commented Jun 14, 2025

I have reviewed the usages in CRuby, and all seem fine in that they should raise FrozenError anyway if given a frozen String.

Except the usage in

rb_str_locktmp(arguments->string);

IO::Buffer.for("str".freeze) { p 1 } gives FrozenError when it was fine before, so I'll change the code to skip the rb_str_locktmp() if the string is frozen.
I ran both test-all and test-spec and neither found this, so it seems IO::Buffer lacks some test coverage.
cc @ioquatix

@eregon
Copy link
Member Author

eregon commented Jun 14, 2025

Actually I'm not sure what is the right behavior there, is there a valid use case to use IO::Buffer.for("str".freeze) { ... }?
Tests in test/ruby/test_io_buffer.rb don't cover it, maybe because it's not really useful anyway?

@eregon eregon requested review from byroot and ioquatix June 14, 2025 11:54
@ioquatix
Copy link
Member

I need some time to review this change.

@eregon
Copy link
Member Author

eregon commented Jun 15, 2025

OK, FWIW, I'm confident this change in IO::Buffer is correct, because the only thing that locking a string does is prevent mutations, e.g. as tested in

def test_string_mapped_buffer_locked
string = "Hello World"
IO::Buffer.for(string) do |buffer|
# Cannot modify string as it's locked by the buffer:
assert_raise RuntimeError do
string[0] = "h"
end
end
end
.
But a frozen string already cannot be mutated, so freezing is stronger than locking and the locking is then redundant.

@eregon
Copy link
Member Author

eregon commented Jun 15, 2025

Ah I've seen https://bugs.ruby-lang.org/issues/20998#note-16.
It's not clear to me if rb_str_locktmp() prevents the char* to move (the docs of rb_str_locktmp() don't mention that).
Looking at usages of STR_TMPLOCK only rb_str_tmp_frozen_release() seems potentially about that, and that does the check like !FL_TEST_RAW(orig, STR_TMPLOCK|RUBY_FL_FREEZE) so it will still behave the same there.

Maybe there is an explicit function to mark a String as shouldn't be moved by GC?
I recalled ruby/fiddle#44 maybe @tenderlove or @peterzhu2118 can advice what's a good way to avoid the char* to be moved (in this case for a frozen String)?

@ioquatix
Copy link
Member

ioquatix commented Jun 15, 2025

In io.c, rb_str_locktmp is used for this purpose - to prevent the underlying pointer to the memory changing while it is in use by a system call.

  • Obtains a "temporary lock" of the string. This advisory locking mechanism
  • prevents other cooperating threads from tampering the receiver. The same
  • thing could be done via freeze mechanism, but this one can also be unlocked
  • using rb_str_unlocktmp().

The entire point is to prevent another thread from causing the string to be reallocated (tampering the receiver).

@byroot
Copy link
Member

byroot commented Jun 15, 2025

what's a good way to avoid the char* to be moved (in this case for a frozen String)?

If you want to avoid GC compaction moving the string, you have to pin it.

To do that rb_io_buffer_type.dmark must call rb_gc_mark (instead of rb_gc_mark_movable). Looking at the code it seems to currently be using the declarative marking API, so I suspect it's currently movable. You'd have to rollback to implementing a .dmark function (or have a way to define that a field should be pinned with declarative marking).

@eregon
Copy link
Member Author

eregon commented Jun 16, 2025

@ioquatix The way I read that comment, it says the freeze mechanism (= .freeze) is as good as rb_str_locktmp().

The entire point is to prevent another thread from causing the string to be reallocated (tampering the receiver).

Freezing already prevents that, it prevents all mutations.

tampering the receiver in that doc is AFAIK only about mutations by other Ruby threads.
i.e. it is not about the GC moving the String.

@eregon
Copy link
Member Author

eregon commented Jun 16, 2025

I would like to know under which circumstances the char* of a String, obtained e.g .via RSTRING_PTR(str) can move, and it would be great to document it (I checked extension.rdoc and didn't find much about it).
I think there is a fair but of confusion around that.

For example the char* of a String cannot move "at any time", most C extension code relies on it not moving, and the GC is not able to patch the char* variables on stack.
In fact the GC can't even move the VALUE of the String if it's on the native stack, as again it can't change that.
So I guess if the VALUE of the String is on stack it's effectively pinned for as long as it's on the native stack, is that correct?

I guess the only cases the char* of a String can change are:

  • When the String is mutated, e.g. via String#<< and the buffer needs to be reallocated (not possible when frozen obviously).
  • When GC compaction runs, which AFAIK can only happen while executing Ruby code not C extension code.
  • When Ruby decides to unshare the storage of multiple Strings maybe?

In our case for IO::Buffer it refers directly to the char* in

void *base;

without getting it out of the String at usages. So relying on the String VALUE to be on stack doesn't work.

In my understanding, I'm not changing anything about GC compaction here, IO::Buffer and rb_str_locktmp() already does not prevent GC compaction to move the String, and freezing is as strong (even stronger) than rb_str_locktmp().
So I think it's out of scope of this PR to fix that, and it's fine to merge as-is.

@eregon
Copy link
Member Author

eregon commented Jun 16, 2025

I think with that last comment it's clear that GC compaction + IO::Buffer is an orthogonal issue to this change, and this change doesn't make it any worse. In short, rb_str_locktmp() doesn't prevent the char* to move via GC compaction, as far as I can see.
So I'll merge this.

It's good that made us think about GC compaction though, there might be a pre-existing bug there, and if so it would be worth opening an issue to track and fix it.
EDIT: there is already https://bugs.ruby-lang.org/issues/21210

@eregon eregon merged commit 2956573 into ruby:master Jun 16, 2025
87 checks passed
@byroot
Copy link
Member

byroot commented Jun 17, 2025

So I guess if the VALUE of the String is on stack it's effectively pinned for as long as it's on the native stack, is that correct?

That is correct.

When GC compaction runs, which AFAIK can only happen while executing Ruby code not C extension code.

Any allocation my trigger GC, regardless if whether it was triggered from C or Ruby code.

When Ruby decides to unshare the storage of multiple Strings maybe?

Yes, in general any mutating rb_str_ method may cause the char * to change, either because it has to xrealloc or something like that. But if the string is frozen, the only thing that could invalidate the char * is compaction.

rb_str_locktmp() doesn't prevent the char* to move via GC compaction, as far as I can see.

That is correct too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants