Adds a pool of UTF-8 decoders, making reader instantiation less expensive.#388
Adds a pool of UTF-8 decoders, making reader instantiation less expensive.#388
Conversation
Codecov Report
@@ Coverage Diff @@
## master #388 +/- ##
============================================
+ Coverage 66.32% 66.35% +0.02%
- Complexity 5348 5360 +12
============================================
Files 148 154 +6
Lines 22602 22619 +17
Branches 4084 4083 -1
============================================
+ Hits 14990 15008 +18
Misses 6252 6252
+ Partials 1360 1359 -1
Continue to review full report at Codecov.
|
|
I reviewed the results of the failing performance regression detector task, and I'm comfortable with them. It detected a very small regression (0.847ms vs 0.860ms, 0.793ms vs 0.803 ms) in two of the three binary read tests, and an improvement in the third (0.926ms vs 0.671ms). In all cases, the slower data point had a super high % error (>40%), indicating either that something may have been competing for resources on the host or that we need to run more iterations, more warmups, throw out runs with a % error above a certain threshold, etc. Note: these tests currently instantiate a single reader, read a small amount of data (~50KB), and close the reader. No performance impact is expected; I confirmed this locally with my ion-java-benchmark-cli runs. |
| * Base class for types that may be pooled. | ||
| * @param <T> the concrete type. | ||
| */ | ||
| abstract class Poolable<T extends Poolable<T>> implements Closeable { |
There was a problem hiding this comment.
This type bound is to ensure that only subclasses of Poolable can implement Poolable, right? Which provides the guarantee that you'll get instantiated with reference to an appropriate Pool?
There was a problem hiding this comment.
Yeah, it allows us to guarantee that we're getting a Pool<T> (i.e. a Pool for this type of Poolable) in the constructor.
Issue #, if available:
Fixes #370
Description of changes:
Poolable- the negative diff forUtf8StringEncodershows what was extracted) and for pools of poolable objects (Pool- the negative diff forUtf8StringEncoderPoolshows what was extracted).Utf8StringDecoder(extendsPoolable) andUtf8StringDecoderPool(extendsPool) to provide reusable logic and character buffers for decoding UTF-8 strings.IonReaderBinaryIncremental(the new incremental binary reader) andIonReaderBinaryRawX(the old binary reader) to useUtf8StringDecoderPoolandUtf8StringDecoder.IonReaderBinaryRawXrequires an additionalByteBufferfor its string decoding (not required by the incremental reader), createdPoolableByteBuffer(extendsPoolable) andByteBufferPool(extendsPool) to provide reusableByteBuffers to this reader.This takes care of a lot of the expense that comes with instantiating binary readers. To test, I benchmarked a loop like the following on a binary Ion file containing only
"foo"(using a manual change toion-java-benchmark-clithat I plan to add as a proper option at some point):Results before the proposed change:
Results after the change:
I tried with various other file sizes as well, and observed (as expected) that the benefit diminishes as the size of the stream increases and more time is spent on parsing than reader instantiation.
I'm going to continue to experiment with pooling in the readers and writers. At the extreme, we could pool an object that holds everything readers and writers need. I'll determine whether the performance benefits would be worth the complexity.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.