Adds a pool of UTF8 String encoders #369

Merged
zslayton merged 3 commits into master from utf8-encoder-pool on Jun 23, 2021
@zslayton
Contributor

Most of the expense involved in constructing new binary writers
comes from allocating/initializing the buffers needed to encode
Java's UTF-16 Strings to UTF-8.

This change refactors the UTF-8 encoding logic into its own
class (Utf8StringEncoder) and introduces a singleton
Utf8StringEncoderPool that allows these encoders to be reused
across instantiations of binary writers.

Benchmark

In a tight loop, this test initializes a new binary writer, writes a small string ("foo"), and then closes the writer. The source can be found here.

Before

Benchmark                             Score           Error   Units
time                                 37.664 ±         3.661   ms/op
·gc.alloc.rate                      773.095 ±         3.882  MB/sec
·gc.alloc.rate.norm           438323910.400 ±    366718.519    B/op
·gc.churn.G1_Eden_Space             739.958 ±       238.206  MB/sec
·gc.churn.G1_Eden_Space.norm  419640115.200 ± 135676093.820    B/op
·gc.churn.G1_Old_Gen                  0.017 ±         0.033  MB/sec
·gc.churn.G1_Old_Gen.norm          9425.600 ±     18474.404    B/op
·gc.count                            20.000                  counts
·gc.time                             13.000                      ms

After

Benchmark                             Score           Error   Units
time                                 17.788 ±         3.631   ms/op
·gc.alloc.rate                       43.899 ±         0.850  MB/sec
·gc.alloc.rate.norm            24007104.000 ±    489259.500    B/op
·gc.churn.G1_Eden_Space              57.248 ±       182.466  MB/sec
·gc.churn.G1_Eden_Space.norm   31457280.000 ± 100263003.914    B/op
·gc.count                             2.000                  counts
·gc.time                              4.000                      ms

Following this change, the benchmark:

  • Took 52.77% less time to run.
  • Reduced its allocation rate from ~773 MB/sec to ~44 MB/sec (94.32% less).
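
The two percentages above follow directly from the Score columns of the Before and After tables. As a quick check, a small sketch (the `PercentReduction` helper is hypothetical and not part of ion-java) recomputes them:

```java
// Hypothetical helper, not part of ion-java: recomputes the summary figures
// from the Score columns of the Before and After benchmark tables.
final class PercentReduction {
    private PercentReduction() {}

    /** Returns how much smaller `after` is than `before`, as a percentage of `before`. */
    static double percentLess(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        // time, in ms/op
        System.out.printf("time: %.2f%% less%n", percentLess(37.664, 17.788));
        // gc.alloc.rate, in MB/sec
        System.out.printf("alloc rate: %.2f%% less%n", percentLess(773.095, 43.899));
    }
}
```

Both results round to the figures quoted in the list (52.77% and 94.32%).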

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov

codecov bot commented Jun 23, 2021

Codecov Report

Merging #369 (1ebb53d) into master (ca85095) will increase coverage by 0.02%.
The diff coverage is 94.73%.

@@             Coverage Diff              @@
##             master     #369      +/-   ##
============================================
+ Coverage     64.05%   64.08%   +0.02%     
- Complexity     4837     4844       +7     
============================================
  Files           136      138       +2     
  Lines         21108    21142      +34     
  Branches       3821     3822       +1     
============================================
+ Hits          13521    13548      +27     
- Misses         6251     6256       +5     
- Partials       1336     1338       +2     
Impacted Files                                          Coverage Δ
...om/amazon/ion/impl/bin/utf8/Utf8StringEncoder.java   91.89% <91.89%> (ø)
...rc/com/amazon/ion/impl/bin/IonRawBinaryWriter.java   91.32% <100.00%> (-0.20%) ⬇️
...mazon/ion/impl/bin/utf8/Utf8StringEncoderPool.java   100.00% <100.00%> (ø)
src/com/amazon/ion/impl/BlockedBuffer.java              50.72% <0.00%> (-0.49%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

private static final byte VARINT_NEG_ZERO = (byte) 0xC0;

// See IonRawBinaryWriter#writeString(String) for usage information.
static final int SMALL_STRING_SIZE = 4 * 1024;
Contributor Author

All of the code removed from this file (IonRawBinaryWriter.java) was moved to the new Utf8StringEncoder class.

Comment on lines +126 to +128
final Utf8StringEncoder utf8StringEncoder = Utf8StringEncoderPool
        .getInstance()
        .getOrCreateUtf8Encoder();
Contributor Author

Rather than allocating several new arrays for each binary writer we construct, we simply pull a Utf8StringEncoder from the pool.

* @return A {@link Result} containing a byte array of UTF-8 bytes and encoded length.
* @throws IllegalArgumentException if the String cannot be encoded as UTF-8.
*/
public Result encode(String text) {
Contributor Author

The encoding logic in this method was migrated without changes.

patchBuffer.close();
allocator.close();
// We cannot use `utf8StringEncoder` again after returning it to the pool.
Utf8StringEncoderPool.getInstance().returnEncoderToPool(utf8StringEncoder);
Contributor Author

When the writer is close()d, return the Utf8StringEncoder to the pool.

Contributor

Something to consider: if the encoder knows the pool that it came from, it can implement a close method and return itself. That way the caller could simply inject it, and the implementation would not need to know about the pool singleton. It is marginally cleaner (e.g. it allows the pool to be turned off if there is some weird multi-threaded thrashing issue), but given how internal all of this stuff is, that may or may not be worthwhile.

Another thing to consider is to inject the pool rather than hard-code it as a singleton. That would give the caller the flexibility to potentially no-op the "construction" and/or "return".

Again, this is minor considering how this code is used, but we have seen issues with global singletons, and threaded applications sometimes require turning off these concurrent shared things.

Contributor

I agree with both of the considerations raised here.
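
The two suggestions in this thread (an encoder that returns itself on close, and a pool that is injected rather than reached through a singleton) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `EncoderPool`, `PooledEncoder`, `NoOpPool`, and `SketchWriter` are hypothetical names, and the byte array merely stands in for a real encoder's buffers.

```java
// Hypothetical names throughout; none of these types exist in ion-java.

/** The seam a caller would inject: any pooling strategy can implement it. */
interface EncoderPool {
    PooledEncoder borrow();
    void giveBack(PooledEncoder encoder);
}

/** An encoder that knows its pool and returns itself when closed. */
final class PooledEncoder implements AutoCloseable {
    private final EncoderPool owner;
    // Stand-in for the reusable buffers that make a real encoder costly to build.
    final byte[] scratch = new byte[4 * 1024];

    PooledEncoder(EncoderPool owner) {
        this.owner = owner;
    }

    @Override
    public void close() {
        owner.giveBack(this);
    }
}

/** Pooling switched off: every borrow allocates and every return is dropped. */
final class NoOpPool implements EncoderPool {
    @Override
    public PooledEncoder borrow() {
        return new PooledEncoder(this);
    }

    @Override
    public void giveBack(PooledEncoder encoder) {
        // Intentionally empty; the encoder simply becomes garbage.
    }
}

/** A writer that is handed its pool and never references a global singleton. */
final class SketchWriter implements AutoCloseable {
    private final PooledEncoder encoder;

    SketchWriter(EncoderPool pool) {
        this.encoder = pool.borrow();
    }

    PooledEncoder encoder() {
        return encoder;
    }

    @Override
    public void close() {
        // The encoder must not be used after this point.
        encoder.close();
    }
}
```

With this shape, an application that runs into contention could construct its writers with `new SketchWriter(new NoOpPool())`, while the default path would pass in a pooling implementation of the same interface.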

Contributor

@tgregg tgregg left a comment

Nice. Can you also do a benchmark where instead of creating many writers that each write a single string, you create one writer that writes many strings? That way we can verify no impact to that use case.

Comment on lines 12 to 13
// A singleton instance.
private static final Utf8StringEncoderPool INSTANCE = new Utf8StringEncoderPool();
Contributor

Consider making Utf8StringEncoderPool an enum with a single value: INSTANCE.

Contributor Author

Huh! TIL. Will do.
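
For readers unfamiliar with the idiom suggested above, here is a minimal sketch that contrasts the two singleton styles. The class names and the `recordLoan` method are hypothetical and chosen only for illustration; they are not what the PR ended up with.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names; a sketch of the two singleton idioms, not the PR's code.

/** Class-based singleton: private constructor plus a static final instance. */
final class ClassSingletonPool {
    private static final ClassSingletonPool INSTANCE = new ClassSingletonPool();

    private ClassSingletonPool() {}

    static ClassSingletonPool getInstance() {
        return INSTANCE;
    }
}

/** Enum-based singleton: the JVM creates INSTANCE once, when the enum is initialized. */
enum EnumSingletonPool {
    INSTANCE;

    // Instance state lives in ordinary fields, just as in a class.
    private final AtomicInteger loans = new AtomicInteger();

    /** Records a loan and returns the running total (illustrative only). */
    int recordLoan() {
        return loans.incrementAndGet();
    }
}
```

Call sites would change from `ClassSingletonPool.getInstance()` to `EnumSingletonPool.INSTANCE`. The enum form needs no hand-written private constructor or accessor, and the language itself prevents a second instance from being created through reflection or serialization.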

private static final Utf8StringEncoderPool INSTANCE = new Utf8StringEncoderPool();

// A queue of previously initialized encoders that can be loaned out.
ArrayBlockingQueue<Utf8StringEncoder> bufferQueue;
Contributor

private final ?

Contributor Author

Good catch!

almann previously approved these changes Jun 23, 2021
Contributor

@almann almann left a comment

Minor points/question below.

Comment on lines 9 to 10
// The maximum number of Utf8Encoders that can be waiting in the queue before new ones will be discarded.
private static final int MAX_QUEUE_SIZE = 32;
Contributor

Out of curiosity, was this just a small number that seemed reasonable or did you get this number from something?

Contributor Author

Just a number that seemed reasonable. In the worst case, an application that had more than 32 binary writers in existence at the same time would be allocating fresh Utf8StringEncoders for the surplus, which seemed low stakes.

That said, raising the ceiling is pretty cheap. Each Utf8StringEncoder is roughly 20KB on the heap and the queue only allocates them as needed, so I might bump this up to 128 while I'm making tweaks.
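
The behavior described in this thread (reuse an idle encoder when one is available, allocate otherwise, and discard returns once the queue is full) can be sketched with a bounded `ArrayBlockingQueue`. This is a simplified illustration rather than the PR's implementation: `SketchEncoder`, `BoundedEncoderPool`, and the method names are hypothetical, and the capacity is a constructor parameter only so the example is easy to exercise.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical stand-in for Utf8StringEncoder; the real class holds encoding buffers.
final class SketchEncoder {
    final byte[] buffer = new byte[4 * 1024];
}

/** A minimal bounded pool; the capacity plays the role of MAX_QUEUE_SIZE. */
final class BoundedEncoderPool {
    private final ArrayBlockingQueue<SketchEncoder> idle;

    BoundedEncoderPool(int maxQueueSize) {
        idle = new ArrayBlockingQueue<>(maxQueueSize);
    }

    /** Hands out an idle encoder if there is one, otherwise allocates a fresh one. */
    SketchEncoder getOrCreate() {
        SketchEncoder encoder = idle.poll(); // non-blocking; null when the queue is empty
        return encoder != null ? encoder : new SketchEncoder();
    }

    /** Keeps the encoder for reuse unless the queue is already full. */
    void giveBack(SketchEncoder encoder) {
        idle.offer(encoder); // non-blocking; returns false and drops the encoder when full
    }

    /** Number of encoders currently waiting to be reused. */
    int idleCount() {
        return idle.size();
    }
}
```

Because neither `poll()` nor `offer()` blocks, callers never wait on the pool; the only cost of exceeding the limit is an extra allocation. Taking the ~20KB estimate at face value, a full queue of 128 idle encoders would retain about 2.5 MB.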

marcbowes previously approved these changes Jun 23, 2021
jobarr-amzn previously approved these changes Jun 23, 2021

/**
* A thread-safe shared pool of {@link Utf8StringEncoder}s that can be used for UTF8 encoding and decoding.
*/
public class Utf8StringEncoderPool {
Contributor

Why not make this generic? Nothing about this class seems to be specific to the UTF-8 encoder use case (assuming that you drop the singleton pattern, but even then you can have a singleton UTF-8 encoder pool that is an instantiation of a generic pool).

Contributor

This pattern may be useful for more cases than just UTF-8 string encoding- do we have any other object pools in ion-java?

Contributor Author

> This pattern may be useful for more cases than just UTF-8 string encoding- do we have any other object pools in ion-java?

There's the PooledBlockAllocatorProvider, but I believe that's it. (Depending on your perspective, the RecyclingStack might qualify?) At any rate, I plan to tackle #370 next, at which point we'll definitely have an opportunity for reuse.

> Why not make this generic? Nothing about this class seems to be specific to the UTF-8 encoder use case (assuming that you drop the singleton pattern, but even then you can have a singleton UTF-8 encoder pool that is an instantiation of a generic pool).

Agreed. I'll do this as part of the PR for #370 if you don't mind.

@tgregg
Contributor

tgregg commented Jun 23, 2021

Note: the binary reader also keeps per-instance buffers for string decoding, and could probably benefit similarly from using a pool. https://github.com/amzn/ion-java/blob/master/src/com/amazon/ion/impl/IonReaderBinaryRawX.java#L118

I created #370 for this.

@zslayton
Contributor Author

@tgregg said:

Can you also do a benchmark where instead of creating many writers that each write a single string, you create one writer that writes many strings? That way we can verify no impact to that use case.

Test data: a text Ion file with 10,000 repetitions of the top level string "brevity is the soul of wit" created with:

yes '"brevity is the soul of wit"' | head -n 10000 > /tmp/brevity.ion

ion-java-benchmark-cli command:

java -jar ion-java-benchmark-cli_VERSION.jar write --forks 2 /tmp/brevity.ion

Before (v1.8.2)

Benchmark                 Score      Error   Units
time                      2.418 ±    0.588   ms/op
:Heap usage               4.114 ±    0.005      MB
:Serialized size          0.280                 MB
:·gc.alloc.rate           0.106 ±    0.005  MB/sec
:·gc.alloc.rate.norm  56423.200 ± 2853.160    B/op
:·gc.count                  ≈ 0             counts

After (this branch)

Benchmark                 Score      Error   Units
time                      2.426 ±    0.458   ms/op
:Heap usage               4.156 ±    0.005      MB
:Serialized size          0.280                 MB
:·gc.alloc.rate           0.028 ±    0.005  MB/sec
:·gc.alloc.rate.norm  15007.200 ± 2810.004    B/op
:·gc.count                  ≈ 0             counts

tgregg previously approved these changes Jun 23, 2021
@zslayton zslayton dismissed stale reviews from tgregg, jobarr-amzn, marcbowes, and almann via 1ebb53d June 23, 2021 19:38