Use a weak equality map instead of _id2ref #7
|
@jeremyevans Review request was suggested by GH... feel free to ignore if you have no interest in this library. |
|
Those runs were clearly not completing so I will try to figure out why it hangs with my version. |
The ObjectSpace._id2ref method is considered an internal API of CRuby and is difficult to implement efficiently on implementations that do not have direct control over the garbage-collected heap. On JRuby, for example, arbitrary _id2ref object lookup is normally disabled, as the implementation requires maintaining a parallel mapping of object IDs to weak object references, adding a very large amount of overhead to all object ID uses. As a result, Ruby standard libraries should try to avoid using _id2ref. This patch introduces a weak map into DRb for managing object references without _id2ref. The map is weak-valued, with a clean subroutine to scrub out defunct entries. Each put initiates a clean, which should ensure that very few stale entries accumulate. This patch does not use WeakMap due to limitations of its API. WeakMap specifies that keys are identity-based, and as of Ruby 3.0 allows immediate values like Symbols and Integers. In order to support this behavior, those values must be idempotent, which on JRuby is impossible to support for fixnum-ranged Integers and flonum-ranged Floats, rendering it useless as an object ID-keyed cache.
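A minimal sketch of the weak-valued map described above, for readers who want to see the shape of the idea. This is not the code from the patch: the class name WeakValueTable and its put/get/clean methods are hypothetical, and it assumes the stdlib WeakRef class is an acceptable weak reference.

```ruby
require 'weakref'

# Hypothetical sketch of a weak-valued id => object table.
# Names are illustrative only and do not come from the patch.
class WeakValueTable
  def initialize
    @mutex = Mutex.new
    @id2ref = {} # Integer id => WeakRef wrapping the object
  end

  # Register obj and return the id a remote peer would use.
  # Every put first scrubs entries whose referent is gone.
  def put(obj)
    @mutex.synchronize do
      clean
      id = obj.__id__
      @id2ref[id] = WeakRef.new(obj)
      id
    end
  end

  # Return the live object for id, or raise RangeError if the id
  # is unknown or the object has been collected.
  def get(id)
    @mutex.synchronize do
      ref = @id2ref[id]
      raise RangeError, "#{id} is not a registered id" unless ref
      begin
        ref.__getobj__
      rescue WeakRef::RefError
        @id2ref.delete(id)
        raise RangeError, "#{id} refers to a recycled object"
      end
    end
  end

  private

  # Drop entries whose referent has been garbage collected.
  def clean
    @id2ref.delete_if { |_id, ref| !ref.weakref_alive? }
  end
end
```

The table only hands out ids for objects it has seen, so lookups never need ObjectSpace._id2ref; the cost is one Hash entry and one WeakRef per exported object.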
|
I see now there is a WeakIdConv that implements something similar using WeakMap. That may be acceptable if users realize that fixnum/flonum keys on JRuby have no guarantee of idempotency. So I thought I would try to run the tests using that as the default:
diff --git a/test/drb/test_drbobject.rb b/test/drb/test_drbobject.rb
index 2b0e206..57d9783 100644
--- a/test/drb/test_drbobject.rb
+++ b/test/drb/test_drbobject.rb
@@ -36,7 +36,7 @@ class TestDRbObject < Test::Unit::TestCase
include DRbObjectTest
def setup
- DRb.start_service
+ DRb.start_service(nil, nil, {:idconv => DRb::WeakIdConv.new})
end
end
Any reason we wouldn't want to just make this the default? |
|
cc @seki who is the official maintainer of drb (https://github.com/ruby/ruby/blob/master/doc/maintainers.rdoc) |
jeremyevans
left a comment
I'm OK with the general approach, though I think the implementation needs a few changes. Please see inline comments. Also, I'm not the drb maintainer, so I'll add seki as a reviewer.
| obj.nil? ? nil : obj.__id__ | ||
| when BasicObject | ||
| obj.__id__ | ||
| obj.nil? ? nil : _put(obj) |
This seems wrong. BasicObject#nil? is not defined. You should probably keep the case obj, and just switch the obj.__id__ to _put(obj) in both when branches. I would guess that is what causes the test failure for basic objects.
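A sketch of the shape suggested in the comment above: keep the case dispatch and only swap the id computation. The put_stub helper is a hypothetical stand-in for the _put method under review, which is not shown in full here.

```ruby
# Hypothetical stand-in for the _put helper from the patch.
def put_stub(obj)
  obj.__id__
end

# Keep the case/when so that nil? is only called on Object instances.
# BasicObject instances do not respond to nil?, but do have __id__.
def to_id(obj)
  case obj
  when Object
    obj.nil? ? nil : put_stub(obj)
  when BasicObject
    put_stub(obj)
  end
end
```

Module#=== is evaluated on the class, not on obj, so the dispatch itself never calls a method that BasicObject lacks.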
| begin | ||
| yield | ||
| ensure | ||
| MUTEX.unlock |
The idiomatic way here would be to use MUTEX.synchronize(&block). However, maybe there is a reason the Thread.handle_interrupt and explicit lock/unlock was done instead. If so, could you add a comment why?
Mostly to avoid the overhead of calling the block in synchronize, since this will be hit hard by anyone using a remote object. Is that worth a comment?
I would switch to MUTEX.synchronize(&block) unless you have a benchmark showing the performance differences are significant enough to warrant this more complex code. If the performance differences are significant, then keeping the code and adding a comment explaining it seems best. My uneducated guess would be that the overhead from having a remote object is much higher than the overhead of a mutex synchronization, even on localhost.
Fair enough. The block form should be slower on CRuby, but it may be noise compared to the weak map and other remote object overhead.
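For reference, a sketch of the simpler form discussed in this thread. The module name SafeLockSketch and the method name safe_lock are hypothetical; only Mutex#synchronize is taken from the discussion.

```ruby
# Hypothetical sketch: delegate locking to Mutex#synchronize instead of
# pairing Thread.handle_interrupt with an explicit lock/unlock.
module SafeLockSketch
  MUTEX = Mutex.new

  # Runs the block while holding MUTEX and returns the block's value.
  # The lock is released even if the block raises.
  def self.safe_lock(&block)
    MUTEX.synchronize(&block)
  end
end
```

Whether the block form is measurably slower would need a benchmark, as noted above; this sketch makes no claim either way.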
| _safe_lock do | ||
| weakref = @id2ref[id] | ||
| if weakref | ||
| result = weakref.__getobj__ rescue nil |
I'm not sure if this handles false values correctly, returning them as nil. I would think you would only want to delete from the hash if __getobj__ raises, and otherwise return the result. Explicit nil values are probably OK, since I'm guessing they result in the same behavior.
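One way to make the distinction raised above explicit is to treat only a WeakRef::RefError as "dead" rather than testing the truthiness of the result. The deref helper below is hypothetical and is not part of the patch.

```ruby
require 'weakref'

# Hypothetical helper: returns [alive, object].
# Only a RefError marks the reference as dead, so a referent that is
# false (or nil) is still reported as alive.
def deref(weakref)
  [true, weakref.__getobj__]
rescue WeakRef::RefError
  [false, nil]
end
```

A caller would delete the table entry only when the first element is false, and otherwise return the second element unchanged.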
| MUTEX = Mutex.new | ||
|
|
||
| def initialize | ||
| @id2ref = {} |
Would ObjectSpace::WeakMap make sense here? That would keep the entries as long as the value is alive, which seems the desired semantic here.
That works only for Ruby 2.7 and higher, since it would require immediates (integer IDs) as keys. That restriction might be ok, since 2.6 did not ship drb as a gem. It would mean this gem is not usable at all on 2.6.
$ rvm ruby-2.6.7 do ruby -e 'ObjectSpace::WeakMap.new[self.__id__] = self'
-e:1:in `[]=': cannot define finalizer for Integer (ArgumentError)
from -e:1:in `<main>'
This gem specifies >= 2.7 so that seems fine:
Line 16 in 82643ca
Also, WeakIdConv already uses ObjectSpace::WeakMap.
I think making WeakIdConv the default is probably the safest way.
I would suggest that we should also remove the id2ref version in order to start moving away from having any official code that uses it. The WeakMap version will be more reliable and should work on 2.7+ compatible implementations.
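To illustrate the WeakMap approach discussed in this thread, here is a minimal sketch of an id converter keyed by integer object ids. It assumes Ruby 2.7 or later, where ObjectSpace::WeakMap accepts Integer keys. The class name WeakMapIdConvSketch is hypothetical, and this is not a copy of DRb::WeakIdConv, whose implementation is not shown in this conversation.

```ruby
# Hypothetical sketch of an id converter backed by ObjectSpace::WeakMap.
# Requires Ruby >= 2.7 (Integer keys raise ArgumentError on 2.6).
class WeakMapIdConvSketch
  def initialize
    @mutex = Mutex.new
    @map = ObjectSpace::WeakMap.new # Integer id => object, held weakly
  end

  # Remember obj and return the id to send to the remote side.
  def to_id(obj)
    return nil if nil.equal?(obj)
    @mutex.synchronize do
      id = obj.__id__
      @map[id] = obj
      id
    end
  end

  # Look the object up again; raise RangeError if the id is unknown
  # or the object has already been collected.
  def to_obj(id)
    return nil if id.nil?
    @mutex.synchronize do
      raise RangeError, "invalid reference: #{id}" unless @map.key?(id)
      @map[id]
    end
  end
end
```

The lookup uses key? rather than nil? on the result so that BasicObject instances can be stored. The JRuby caveat about fixnum-ranged keys raised earlier in this conversation still applies to any design that keys a WeakMap by Integer.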
A quick Line 1351 seems relevant. |
|
@seki The main outstanding question is whether we should just replace the id2ref implementation with the WeakMap implementation, since that version should be more reliable and id2ref is problematic and effectively deprecated. If this is acceptable, I can modify this PR to use the WeakIdConv and remove the current default id2ref logic. |
|
I close this in favor of #35. |
This PR moves away from using _id2ref for object references, instead using a weak-valued Hash with appropriate locking and cleaning logic.
From the primary commit:
This implementation currently hangs partway through the tests, and I am unsure why. I wanted to get this implementation out there to get help refining it and getting tests passing, since _id2ref is very problematic on JRuby and JRuby would like to work with the released versions of DRb when we release our Ruby 3.0-compatible release later this year.