Features: ClosableCharSequence and support for custom charset decoder + 2 fixes#9
Open
akhomchenko wants to merge 8 commits intofge:masterfrom
Open
Features: ClosableCharSequence and support for custom charset decoder + 2 fixes#9akhomchenko wants to merge 8 commits intofge:masterfrom
akhomchenko wants to merge 8 commits intofge:masterfrom
Conversation
Change includes: * https instead of http for spring repo as http endpoint is deprecated * move properties after plugins as otherwise some properties are not resolved * replace '<<' with `doLast` as '<<' is deprecated since Gradle 5.0 * remove osgi plugin as it is deprecated and I am not sure it does something useful * remove wrapper task as updates can be done via built-in gradle wrapper task * replace broken javadoc links with working
ClosableCharSequence maintains same properties as Closable & CharSequence and is more convenient for cases when not only LargeText is used. For example: for files < 10Mb code can just read to CharSequence while LargeText is used for anything above threshold.
Default (REPORT) action of CharsetDecoder is not always desired. With new API supplier of CharsetDecoder can be provided. This allows to customize CharsetDecoder to user needs. TextDecoder#nextRange was update to account for different CodingErrorActions. endOfInput for decode method is now determined dynamically. This allows to get access to UNDERFLOW CoderResult and avoid replacement of 1+ byte character that happened to be at the upper bound of range. NotThreadSafeLargeTextTest was updated to test all supported CodingErrorActions. Strings.repeat in tests were replaced with RandomStringUtils from apache-commons lang3. RandomStringUtils generate string with characters of various byte length (1-4 for UTF8) that is better for randomized testing than fixed 1 byte string that was used before. Also: * updated guava to 30.1-android (guava versions 21+ require jdk8, android flavor is compatible with jdk7, see: https://github.com/google/guava/wiki/Compatibility#older-jdks) * updated assertj-core to latest 2.x version (3.x is not compatible with jdk8)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Evening.
I used you library and it works great. I found it lacking 2 features and I decided to contribute them back. Also I found 2 issues that I wrote patches for and decided to contribute them back as well. I can split this into multiple PRs if you want to include only parts of it.
Feature 1:
ClosableCharSequenceThis change enabled me to hide implementation detail and back reading different sources behind single interface. It is breaking as I have replaced
Closable, CharSequencewithClosableCharSequence. I am willing to work on a better strategy for this one.Feature 2: support for custom charset decoder
This change enabled me to handle possible decoding errors better and provide similar behavior for different file-reading implementations. I am not an expert in encodings so please double-check.
Concurrent issues
Discovered both issues at the time I was running tests multiple times while working on "support for custom charset decoder" feature.
Notes
EmptyCharSequence#toStringsimilar to Fix EmptyCharSequence returning 'INSTANCE' as it's toString representation #6. Test is different in my case but I am happy to rebase my code on top of that PR if you plan to merge it.