Conversation
406caaf to
e996a2c
Compare
leventov
left a comment
There was a problem hiding this comment.
Thanks for these findings! The bugs are real, but they should be fixed differently.
There was a problem hiding this comment.
The former code here is indeed buggy, however putting synchronized on getUnderlyingProvider() is not how this method should be fixed.
Please make the following:
- Declare
configfieldfinal - Add a comment noting that
FileSessionCredentialsProvideris safely initialized because it contains afinalfieldsessionCredentials. Also a comment should be added on that field noting thatfinalmodifier should not be removed from it, because it has concurrency implications inLazyFileSessionCredentialsProvider. - In
getUnderlyingProvider(),providershould be read into a local variable, local variable updated in thesynchronizedblock along with the field, and the local is returned in the last line of the method. This is because currently the second read ofproviderin the last line of the method could returnnulleven if the first read in the first line returned non-null and thesynchronizedblock was not entered.
There was a problem hiding this comment.
@leventov do you mean that?
private final AWSCredentialsConfig config;
@Nullable
private FileSessionCredentialsProvider provider;
public LazyFileSessionCredentialsProvider(AWSCredentialsConfig config)
{
this.config = config;
}
//FileSessionCredentialsProvider is safely initialized because it contains a final field sessionCredentials.
private synchronized FileSessionCredentialsProvider getUnderlyingProvider()
{
FileSessionCredentialsProvider syncedProvider = provider;
if (syncedProvider == null) {
synchronized (config) {
syncedProvider = new FileSessionCredentialsProvider(config.getFileSessionCredentials());
provider = syncedProvider;
}
}
return syncedProvider;
}There was a problem hiding this comment.
- No
synchronizedon the method - There should be another check inside the
synchronizedblock (because currently it's not double checked locking :) - I would put the comment before the
provider = syncedProvider;line:
// The provider field doesn't need to be volatile because FileSessionCredentialsProvider
// is safely initialized because it contains a final field sessionCredentials.
Added "The provider field doesn't need to be volatile" because that's what specifically needs to be explained here.
There was a problem hiding this comment.
Also for the comment to be accurate, this PR should be merged not earlier than this one: #6664
There was a problem hiding this comment.
This post helps make sense of what @leventov is saying: https://shipilev.net/blog/2014/safe-public-construction/. I admit I had to read it before understanding what was meant by the comment "FileSessionCredentialsProvider is safely initialized because it contains a final field sessionCredentials".
@leventov - am I understanding you right - are you suggesting that we should rely on the fact that an object with at least one final field will always be safely initialized, and skip the volatile? i.e. in the terms used in the post linked above, you are suggesting it's best to do "Unsafe Local DCL + Safe Singleton" in this case?
If so, I don't feel super comfortable about that. The reason is that it is easy to screw it up: removing a final field in a different class (the class being initialized) breaks the holder class's correctness. Comments help with this problem, but IMO, it makes life easier to make the provider field volatile, which makes the correctness obvious from just looking at the container class (in this case, LazyFileSessionCredentialsProvider). Limiting the scope of what a maintainer needs to consider to a single file is a good thing, since it makes future maintenance easier and less error-prone.
Separately from that, it is not clear to me that this behavior is in fact guaranteed. Please educate me if I am missing something (I haven't studied the relevant sections of the Java spec) but it seems like the "do a totally safe initialization if any final is written" behavior is just an implementation detail of a particular VM and not something guaranteed. If that is right, then to get guaranteed safe initialization of an entire class, all fields should be final (or volatile).
Long story short, in this (non-perf-critical) case I am advocating for "Safe DCL", meaning the code below. And in performance critical code, "Unsafe Local DCL + Safe Singleton" could make sense, although it's worth verifying. The linked post above is 4 years old and it just did one set of tests, but, in that test that pattern didn't run any faster than "Safe DCL" or "Safe Local DCL" on x86.
private final AWSCredentialsConfig config;
@Nullable
private volatile FileSessionCredentialsProvider provider;
public LazyFileSessionCredentialsProvider(AWSCredentialsConfig config)
{
this.config = config;
}
private FileSessionCredentialsProvider getUnderlyingProvider()
{
if (provider == null) {
synchronized (config) {
if (provider == null) {
provider = new FileSessionCredentialsProvider(config.getFileSessionCredentials());
}
}
}
return provider;
}There was a problem hiding this comment.
Also since volatile you don't need that "safely initialized" comment on the field write line, but you need to write a similar comment on the field declaration, saying that the field is declared volatile in order to ensure safe publication of the object without worrying about final modifiers on the fields of the created object.
There was a problem hiding this comment.
Here is the implementation with reference to Effective Java, Second Edition Item 71: Use lazy initialization judiciously:
private final AWSCredentialsConfig config;
/* The field is declared volatile in order to ensure safe publication of the object
without worrying about final modifiers on the fields of the created object */
@Nullable
private volatile FileSessionCredentialsProvider provider;
public LazyFileSessionCredentialsProvider(AWSCredentialsConfig config)
{
this.config = config;
}
//FileSessionCredentialsProvider is safely initialized because it contains a final field sessionCredentials.
private FileSessionCredentialsProvider getUnderlyingProvider() {
FileSessionCredentialsProvider syncedProvider = provider;
if (syncedProvider == null) {
synchronized (config) {
syncedProvider = provider;
if (syncedProvider == null) {
syncedProvider = new FileSessionCredentialsProvider(config.getFileSessionCredentials());
provider = syncedProvider;
}
}
}
return syncedProvider;
}There was a problem hiding this comment.
👍 except the final thing, the comment should be a javadoc comment:
/**
* The field is declared volatile in order to ensure safe publication of the object
* in {@link #getUnderlyingProvider()} without worrying about final modifiers
* on the fields of the created object
*/There was a problem hiding this comment.
This line should also be removed:
//FileSessionCredentialsProvider is safely initialized because it contains a final field sessionCredentials.There was a problem hiding this comment.
To extend Gian's point from #6662 (comment), there is another reason why I think that double-checked locking without volatile should be prohibited in Druid entirely. When some code is added that bypasses the initialization path, and only checks that the field is initialized, a-la
if (lazilyInitializedField != null) {
doSomethingWith(lazilyInitializedField);
}There is a race and possibility of NPE. The second read of the field might result in null, despite the first read returned non-null and the check was passed. The correct code is:
Foo lazilyInitializedField = this.lazilyInitializedField;
if (lazilyInitializedField != null) {
doSomethingWith(lazilyInitializedField);
}But I think it's unrealistic to expect this level of attention to details and discipline from all developers and reviewers.
There was a problem hiding this comment.
This class also requires changes, similar to points 2. and 3. LazyFileSessionCredentialsProvider.
For p. 2, please add a comment that the returned compiled script object is an anonymous inner class, therefore it has an implicit final field (a reference to the object of the enclosing class), therefore it is safely initialized without volatile on compiledScript.
For p. 3, checkAndCompileScript() should be made to return the compiledScript object and that object must be used when checkAndCompileScript() is called, instead of re-reading compiledScript.
There was a problem hiding this comment.
Same as in JavaScriptAggregatorFactory
There was a problem hiding this comment.
Same as in JavaScriptAggregatorFactory
There was a problem hiding this comment.
Please add @Nullable to the provider field
There was a problem hiding this comment.
Please add @Nullable to compiledScript
There was a problem hiding this comment.
Please add @Nullable to predicateFactory.
2adbacf to
a9746d8
Compare
There was a problem hiding this comment.
Could you please add dependency to org.checkerframework:checker-qual to the project and annotate this field @MonotonicNonNull?
There was a problem hiding this comment.
Do you mean: https://checkerframework.org/api/org/checkerframework/checker/nullness/qual/MonotonicNonNull.html This is not included at druid-processing dependencies?
There was a problem hiding this comment.
Just for this part, documentation says that:
Use of @MonotonicNonNull on a static field is a code smell: it may indicate poor design. You should consider whether it is possible to make the field a member field that is set in the constructor.
There was a problem hiding this comment.
Same problem applies to interface import. Here is a related explanation: https://stackoverflow.com/questions/29097160/why-are-nonnull-annotation-on-fqn-not-allowed On demand static import solves the problem as follows, otherwise it gives that error:
Static Member Qualifying Type may Not be Annotated
@MonotonicNonNull
@Nullable
private volatile ScriptAggregator compiledScript;There was a problem hiding this comment.
It shouldnt have MonotonicNonNull AND Nullable. Just MonotonicNonNull
There was a problem hiding this comment.
Please make it to return a value. Rename to getCompiledScript(). @EnsuresNonNull("compiledScript")
There was a problem hiding this comment.
All the same comments as in the prev. classes
There was a problem hiding this comment.
All the same comments as in the prev. classes
2c7f4fe to
8e9b971
Compare
|
@kamaci note that you must not force push PR branch: https://github.com/apache/incubator-druid/blob/master/CONTRIBUTING.md#if-your-pull-request-shows-conflicts-with-master |
0be7cc1 to
de26d3a
Compare
1d74d03 to
e36b84b
Compare
👍 |
I feel like maybe this is a bit less important now since github has added history for force pushes https://blog.github.com/changelog/2018-11-15-force-push-timeline-event/, at least as long as squashing/fixing are not done, unless there are other reasons I'm missing. |
| /** | ||
| * The field is declared volatile in order to ensure safe publication of the object | ||
| * in {@link #getUnderlyingProvider()} without worrying about final modifiers | ||
| * on the fields of the created object |
There was a problem hiding this comment.
I realised that unlikely that many readers of this code will understand this passage without the reference to the thread. Please add , see https://github.com/apache/incubator-druid/pull/6662#discussion_r237013157 in the end
| public Aggregator factorize(final ColumnSelectorFactory columnFactory) | ||
| { | ||
| checkAndCompileScript(); | ||
| getCompiledScript(); |
There was a problem hiding this comment.
This method should be used instead of compiledScript a few lines below, that was the whole point. Same in other places where getCompiledScript() is called.
| /** | ||
| * The field is declared volatile in order to ensure safe publication of the object | ||
| * in {@link #compileScript(String, String, String)} without worrying about final modifiers | ||
| * on the fields of the created object |
There was a problem hiding this comment.
, see https://github.com/apache/incubator-druid/pull/6662#discussion_r237013157
| /** | ||
| * The field is declared volatile in order to ensure safe publication of the object | ||
| * in {@link #compile(String)} without worrying about final modifiers | ||
| * on the fields of the created object |
There was a problem hiding this comment.
, see https://github.com/apache/incubator-druid/pull/6662#discussion_r237013157
| * script compilation. | ||
| */ | ||
| @EnsuresNonNull("fn") | ||
| private void checkAndCompileScript() |
There was a problem hiding this comment.
Please convert to lazy getter and use it as in the previous class.
There was a problem hiding this comment.
All the same comments as in the prev. classes
There was a problem hiding this comment.
All the same comments as in the prev. classes
| Preconditions.checkState(config.isEnabled(), "JavaScript is disabled"); | ||
|
|
||
| Function syncedFn = fn; | ||
| if (syncedFn == null) { |
There was a problem hiding this comment.
@leventov Seems that here is always null at new case. Do I miss anything?
There was a problem hiding this comment.
I don't understand you message.
There was a problem hiding this comment.
Seems that first check for syncedFn
if (syncedFn == null)is unnecessary.
On the other hand, I've just sent another commit to address your comments.
| public Aggregator factorize(final ColumnSelectorFactory columnFactory) | ||
| { | ||
| checkAndCompileScript(); | ||
| compiledScript = getCompiledScript(); |
There was a problem hiding this comment.
It reassigns the field again, that is not needed. Either those lines should be
JavaScriptAggregator.ScriptAggregator compiledScript = getCompiledScript();Or (IMO) it's simpler just to use getCompiledScript() without introducing a local variable, in methods where it's needed only once.
Same below
| public Object compute(Map<String, Object> combinedAggregators) | ||
| { | ||
| checkAndCompileScript(); | ||
| fn = getCompiledScript(); |
There was a problem hiding this comment.
Same as above, either introduce a local variable or just use getCompiledScript() in the expression, but don't reassign the volatile field.
| public String apply(@Nullable Object value) | ||
| { | ||
| checkAndCompileScript(); | ||
| fn = getCompiledScript(); |
| * on the fields of the created object | ||
| */ | ||
| @MonotonicNonNull | ||
| private volatile Function<Object, String> fn; |
There was a problem hiding this comment.
Mention https://github.com/apache/incubator-druid/pull/6662#discussion_r237013157
| * The field is declared volatile in order to ensure safe publication of the object | ||
| * in {@link JavaScriptPredicateFactory(String, ExtractionFn)} without worrying about final modifiers | ||
| * on the fields of the created object | ||
| */ |
There was a problem hiding this comment.
Mention https://github.com/apache/incubator-druid/pull/6662#discussion_r237013157
| public Filter toFilter() | ||
| { | ||
| checkAndCreatePredicateFactory(); | ||
| predicateFactory = getPredicateFactory(); |
| public BufferAggregator factorizeBuffered(final ColumnSelectorFactory columnSelectorFactory) | ||
| { | ||
| checkAndCompileScript(); | ||
| compiledScript = getCompiledScript(); |
There was a problem hiding this comment.
Still reassign. Also other occurrences in this PR.
|
@kamaci please merge master into your PR branch to make CI happy |
|
However, not needed, CI passed itself |
|
👍 |
* Double-checked locking bug is fixed. * @nullable is removed since there is no need to use along with @MonotonicNonNull. * Static import is removed. * Lazy initialization is implemented. * Local variables used instead of volatile ones. * Local variables used instead of volatile ones.
Using double-checked locking for the lazy initialization of any other type of primitive or mutable object risks a second thread using an uninitialized or partially initialized member while the first thread is still creating it, and crashing the program.