Features: ClosableCharSequence and support for custom charset decoder + 2 fixes by akhomchenko · Pull Request #9 · fge/largetext

akhomchenko · 2021-04-26T07:40:38Z

Evening.

I used your library and it works great. I found it lacking two features and decided to contribute them back. I also found two issues, wrote patches for them, and am contributing those as well. I can split this into multiple PRs if you want to include only parts of it.

Feature 1: `ClosableCharSequence`

This change let me hide implementation details and put reading from different sources behind a single interface. It is a breaking change, as I have replaced `Closeable, CharSequence` with `ClosableCharSequence`. I am willing to work on a better strategy for this one.

Feature 2: support for custom charset decoder

This change let me handle possible decoding errors better and provide consistent behavior across different file-reading implementations. I am not an expert in encodings, so please double-check.
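To illustrate the idea, here is a minimal sketch that uses only JDK classes. The `DecoderSupplierDemo` class, its `STRICT`/`LENIENT` suppliers, and the use of `java.util.function.Supplier` are assumptions for illustration; the supplier type and method names in this PR may differ.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

// Hypothetical demo class; not part of the library.
final class DecoderSupplierDemo {
    // REPORT is what a fresh decoder does by default; set explicitly for clarity.
    static final Supplier<CharsetDecoder> STRICT = () -> StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);

    // REPLACE substitutes the decoder's replacement string (U+FFFD for UTF-8).
    static final Supplier<CharsetDecoder> LENIENT = () -> StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);

    // Decoders are stateful, so a new one is requested for every call.
    static String decode(Supplier<CharsetDecoder> decoders, byte[] bytes)
            throws CharacterCodingException {
        return decoders.get().decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Convenience wrapper: returns null when the decoder reports an error.
    static String decodeOrNull(Supplier<CharsetDecoder> decoders, byte[] bytes) {
        try {
            return decode(decoders, bytes);
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] broken = {'o', 'k', (byte) 0xFF}; // 0xFF is never valid in UTF-8
        System.out.println("strict:  " + decodeOrNull(STRICT, broken));
        System.out.println("lenient: " + decodeOrNull(LENIENT, broken));
    }
}
```

A supplier rather than a single decoder instance seems the safer shape here, since a `CharsetDecoder` carries state and should not be shared between concurrent decoding operations.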

Concurrent issues

I discovered both issues while running the tests repeatedly during work on the "support for custom charset decoder" feature.
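The first fix moves a copy of the ranges into a synchronized block. The sketch below shows the general pattern only; `RangeRegistry`, `addRange`, and `snapshot` are hypothetical names and not the library's classes. Copying a list while another thread mutates it is unsafe and can fail, for example with a ConcurrentModificationException, so the copy is taken while holding the same lock the writer uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the structure that tracks decoded ranges.
final class RangeRegistry {
    // Guarded by its own monitor.
    private final List<Integer> rangeEnds = new ArrayList<>();

    // Writer side: called as each new range is decoded.
    void addRange(int endOffset) {
        synchronized (rangeEnds) {
            rangeEnds.add(endOffset);
        }
    }

    // Reader side: the copy is made inside the synchronized block,
    // so it never observes the list in the middle of a mutation.
    List<Integer> snapshot() {
        synchronized (rangeEnds) {
            return Collections.unmodifiableList(new ArrayList<>(rangeEnds));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RangeRegistry registry = new RangeRegistry();
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= 10_000; i++) {
                registry.addRange(i);
            }
        });
        writer.start();
        while (writer.isAlive()) {
            registry.snapshot(); // safe to call while the writer is running
        }
        writer.join();
        System.out.println(registry.snapshot().size());
    }
}
```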

Notes

* I've tried to add tests wherever possible, to the best of my knowledge. Tests for the concurrency issues were not written, but I have documented steps to reproduce them.
* Includes the change from "update gradle wrapper to 6.7.1" (#8). I will rebase if #8 is merged first.
* Includes a fix for `EmptyCharSequence#toString` similar to "Fix EmptyCharSequence returning 'INSTANCE' as it's toString representation" (#6). The test is different in my case, but I am happy to rebase my code on top of that PR if you plan to merge it.
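For context on the `EmptyCharSequence#toString` fix, here is a sketch under the assumption that `EmptyCharSequence` is an enum singleton, which the 'INSTANCE' output suggests. The body below is a hypothetical stand-in, not the library's source.

```java
// Hypothetical stand-in for the library's EmptyCharSequence (assumed to be an enum singleton).
enum EmptyCharSequence implements CharSequence {
    INSTANCE;

    @Override
    public int length() {
        return 0;
    }

    @Override
    public char charAt(int index) {
        throw new IndexOutOfBoundsException("empty sequence, index: " + index);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start != 0 || end != 0) {
            throw new IndexOutOfBoundsException("empty sequence, range: " + start + ".." + end);
        }
        return this;
    }

    // Without this override, Enum#toString() returns the constant's name ("INSTANCE"),
    // while CharSequence#toString() is expected to return the characters of the sequence.
    @Override
    public String toString() {
        return "";
    }
}
```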

Change includes:

* https instead of http for the spring repo, as the http endpoint is deprecated
* move properties after plugins, as otherwise some properties are not resolved
* replace '<<' with `doLast`, as '<<' was removed in Gradle 5.0
* remove the osgi plugin, as it is deprecated and I am not sure it does anything useful
* remove the wrapper task, as updates can be done via the built-in gradle wrapper task
* replace broken javadoc links with working ones

`ClosableCharSequence` maintains the same properties as `Closeable` & `CharSequence` and is more convenient when LargeText is not the only source in use. For example, for files under 10 MB code can simply read into a CharSequence, while LargeText is used for anything above the threshold.
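A rough sketch of that use case follows. Only the `ClosableCharSequence` name comes from this PR; its declaration here is a guess, and `LargeFileOpener`, `InMemoryText`, and `TextSources` are hypothetical helpers. The large-file branch is left as a pluggable hook rather than calling the library, so the example stays self-contained.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the combined interface; the PR's actual declaration may differ.
interface ClosableCharSequence extends CharSequence, Closeable {
}

// Hypothetical hook for the large-file path (for example, something returning a LargeText).
interface LargeFileOpener {
    ClosableCharSequence open(Path file) throws IOException;
}

// Hypothetical small-file implementation: the whole content lives in memory.
final class InMemoryText implements ClosableCharSequence {
    private final String content;

    InMemoryText(String content) {
        this.content = content;
    }

    @Override public int length() { return content.length(); }
    @Override public char charAt(int index) { return content.charAt(index); }
    @Override public CharSequence subSequence(int start, int end) { return content.subSequence(start, end); }
    @Override public String toString() { return content; }
    @Override public void close() { /* nothing to release */ }
}

final class TextSources {
    // Assumed threshold, based on the 10 MB figure mentioned above.
    static final long THRESHOLD_BYTES = 10L * 1024 * 1024;

    static boolean fitsInMemory(long sizeInBytes) {
        return sizeInBytes < THRESHOLD_BYTES;
    }

    // Callers only ever see ClosableCharSequence, whichever branch is taken.
    static ClosableCharSequence open(Path file, LargeFileOpener largeFiles) throws IOException {
        if (fitsInMemory(Files.size(file))) {
            return new InMemoryText(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        }
        return largeFiles.open(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("small", ".txt");
        Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
        LargeFileOpener notWired = path -> {
            throw new IOException("large-file path is not wired in this sketch");
        };
        try (ClosableCharSequence text = open(file, notWired)) {
            System.out.println(text.length());
        } finally {
            Files.delete(file);
        }
    }
}
```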

The default (REPORT) action of CharsetDecoder is not always desired. With the new API, a supplier of CharsetDecoder can be provided, which lets users customize the CharsetDecoder to their needs.

TextDecoder#nextRange was updated to account for the different CodingErrorActions. endOfInput for the decode method is now determined dynamically. This gives access to the UNDERFLOW CoderResult and avoids replacing a multi-byte character that happens to sit at the upper bound of a range.

NotThreadSafeLargeTextTest was updated to test all supported CodingErrorActions. Strings.repeat in tests was replaced with RandomStringUtils from apache-commons lang3. RandomStringUtils generates strings with characters of various byte lengths (1-4 for UTF8), which is better for randomized testing than the fixed 1-byte string used before.

Also:

* updated guava to 30.1-android (guava versions 21+ require jdk8, android flavor is compatible with jdk7, see: https://github.com/google/guava/wiki/Compatibility#older-jdks)
* updated assertj-core to the latest 2.x version (3.x is not compatible with jdk7)
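The effect of endOfInput at a range boundary can be seen with plain JDK classes. This is not the TextDecoder#nextRange code, only a sketch of the underlying behavior; `RangeBoundaryDemo`, `Outcome`, and `decodeRange` are hypothetical names. The byte range ends in the middle of the two-byte UTF-8 encoding of 'é' (0xC3 0xA9).

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the library.
final class RangeBoundaryDemo {
    static final class Outcome {
        final String text;       // characters decoded from the range
        final int leftoverBytes; // bytes the decoder left unconsumed

        Outcome(String text, int leftoverBytes) {
            this.text = text;
            this.leftoverBytes = leftoverBytes;
        }
    }

    static Outcome decodeRange(byte[] range, boolean endOfInput) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        ByteBuffer in = ByteBuffer.wrap(range);
        CharBuffer out = CharBuffer.allocate(range.length + 1);
        CoderResult result = decoder.decode(in, out, endOfInput);
        if (!result.isUnderflow()) {
            throw new IllegalStateException("unexpected result: " + result);
        }
        out.flip();
        return new Outcome(out.toString(), in.remaining());
    }

    public static void main(String[] args) {
        byte[] range = {'a', (byte) 0xC3}; // second byte of 'é' falls into the next range

        // endOfInput = false: the trailing byte stays in the buffer for the next range.
        Outcome more = decodeRange(range, false);
        System.out.println(more.text.length() + " chars, " + more.leftoverBytes + " bytes left");

        // endOfInput = true: the trailing byte is treated as malformed and replaced.
        Outcome last = decodeRange(range, true);
        System.out.println(last.text.length() + " chars, " + last.leftoverBytes + " bytes left");
    }
}
```

Passing `false` while more bytes may follow leaves the incomplete sequence untouched, which is why deciding endOfInput per range matters when REPLACE is in effect.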


akhomchenko added 8 commits January 12, 2021 18:36

fix EmptyCharSequence#toString to return empty String

57b5fdc

add steps to reproduce getRanges ConcurrentModificationException

7cc57a5

move copyOf to synchronized block to avoid ConcurrentModificationException

8d3b805

add steps to reproduce charAt NullPointerException caused by read before write

7876c7e

fix CharWaiters notification prior to ranges update that causes NPE

bcf450c

akhomchenko wants to merge 8 commits into fge:master from akhomchenko:feature/custom-charset-decoder
