Skip to content

Conversation

@scwhittle
Copy link
Contributor

This replaces the current implementation of a custom output stream wrapped by standard PrintStream, removing a possible deadlock between the PrintStream and handler. The
deadlock can occur if something beneath the handler lock attempts to use System.err, reversing
the normal locking order.

By using PrintStream we can use a StringBuffer instead of a ByteBuffer, which also avoids extra encoding and decoding between byte arrays and Strings.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

Post-Commit Tests Status (on master branch)

Lang SDK Apex Dataflow Flink Gearpump Samza Spark
Go Build Status --- --- Build Status --- --- Build Status
Java Build Status Build Status Build Status
Build Status
Build Status
Build Status
Build Status
Build Status Build Status Build Status
Build Status
Build Status
Python Build Status
Build Status
Build Status
Build Status
--- Build Status
Build Status
Build Status
Build Status
Build Status
--- --- Build Status
XLang --- --- --- Build Status --- --- Build Status

Pre-Commit Tests Status (on master branch)

--- Java Python Go Website
Non-portable Build Status Build Status
Build Status
Build Status Build Status
Portable --- Build Status --- ---

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

@scwhittle
Copy link
Contributor Author

R: @lukecwik
There are some questions in the change around desired behavior printing char arrays and if we should attempt to detect newlines in strings/arrays passed to print.

Copy link
Member

@kennknowles kennknowles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed pull request description. I was able to understand the reason for this and actually able to do a review. Is there a way to test this? (not the deadlock or lack thereof but just the basic functionality)

Copy link
Member

@lukecwik lukecwik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to check for \n for any object that could contain it.

Copy link
Member

@lukecwik lukecwik Mar 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be an issue for multi-byte wide characters that have been split?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a CharsetDecoder that is used for this method

Copy link
Member

@lukecwik lukecwik Mar 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either use assert !Thread.holdsLock(this) : "BEAM-9399: This thread should never hold this lock"; or checkState(!Thread.holdsLock(this), "BEAM-9399: This thread should never hold this lock"); to guard against this case.

https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html#holdsLock%28java.lang.Object%29

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@lukecwik lukecwik Mar 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes a lot more sense to move all the newline handling into publish and for it to check for new lines. If you want to avoid the cost of indexOf/substring call, you could create an internal method that publishWithNewLines that does all the substring/indexOf work and only use publish for the trivial println methods

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous code removed trailing newlines when publishing, which makes sense to me since we don't need them in the final log message.
Is there a reason not to do it here? This is a single callsite and avoids making a substring.

The checking for newlines for flushing seems like it needs to be in each function since otherwise we wouldn't be calling publish.

@scwhittle
Copy link
Contributor Author

Thanks for the detailed pull request description. I was able to understand the reason for this and actually able to do a review. Is there a way to test this? (not the deadlock or lack thereof but just the basic functionality)
There is an existing test for this JulHandlerPrintStreamAdapterFactoryTest. Let me know if there are additions you would like there.

Copy link
Member

@lukecwik lukecwik Mar 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to use the stateful version of the CharsetDecoder.decode since this method assumes that the byte[] range represents a whole valid String.

In the other methods you would need to finish the decoding if there is a partial decoding in flight.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching. I added a unit test for raw bytes and fixed to keep a buffer of remainders.

@scwhittle scwhittle force-pushed the logging_deadlock branch 2 times, most recently from d0b54ab to 9651178 Compare March 19, 2020 21:40
@aaltay
Copy link
Member

aaltay commented Mar 27, 2020

retest this please

@lukecwik
Copy link
Member

Run Java PreCommit

@lukecwik
Copy link
Member

Run Spotless PreCommit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to use the default charset for getBytes since you use the default charset in the CharsetDecoder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, done.

Copy link
Member

@lukecwik lukecwik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address my comment about using the default charset during testing and also address the spotless error described below:

14:06:02 > The following files had format violations:
14:06:02 runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/logging/JulHandlerPrintStreamAdapterFactory.java
14:06:02 @@ -144,7 +144,7 @@
14:06:02 ··············flush·=·true;
14:06:02 ············}
14:06:02 ············//·Keep·the·unread·bytes.
14:06:02 -············assert(incoming.remaining()·<=·carryOverByteArray.length);
14:06:02 +············assert·(incoming.remaining()·<=·carryOverByteArray.length);
14:06:02 ············while·(incoming.hasRemaining())·{
14:06:02 ··············carryOverByteArray[carryOverBytes++]·=·incoming.get();
14:06:02 ············}
14:06:02 Run 'gradlew spotlessApply' to fix these violations.

…Stream

instead of a custom output stream wrapped by standard PrintStream.
@scwhittle
Copy link
Contributor Author

I pushed fixes for both changes you requested but it isn't letting me close your changes requested . For the future, should push commits and only squash them once reviewing is complete? Thanks!

@lukecwik
Copy link
Member

retest this please

Copy link
Member

@lukecwik lukecwik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conditionally on tests being green.

@lukecwik
Copy link
Member

retest this please

@lukecwik
Copy link
Member

I pushed fixes for both changes you requested but it isn't letting me close your changes requested . For the future, should push commits and only squash them once reviewing is complete? Thanks!

You shouldn't squash commits because it makes it harder for the reviewer to see the diff between versions. Also it messes up parts of the comments/suggestion history. Once you get an LGTM, you can either squash and fix-up your commit history then or allow the reviewer to squash and merge your commits.

Copy link
Member

@lukecwik lukecwik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix:

08:19:51 /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit/src/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/logging/JulHandlerPrintStreamAdapterFactoryTest.java:122: error: cannot find symbol
08:19:51     byte[] bytes = msg.getBytes(Charset.defaultCharset());
08:19:51                                 ^
08:19:51   symbol:   variable Charset
08:19:51   location: class JulHandlerPrintStreamAdapterFactoryTest
08:19:51 /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit/src/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/logging/JulHandlerPrintStreamAdapterFactoryTest.java:129: error: cannot find symbol
08:19:51     byte[] newlineMsgBytes = newlineMsg.getBytes(Charset.defaultCharset());
08:19:51                                                  ^
08:19:51   symbol:   variable Charset

@scwhittle
Copy link
Contributor Author

Fixed. Sorry for that, I ran tests but must have been on the wrong branch.

@lukecwik
Copy link
Member

retest this please

@lukecwik
Copy link
Member

retest this please

@lukecwik
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants