Only initiate PatchPoint when needed.#565
Conversation
badGeneratedData.10n
Is this intentionally committed?
This shouldn't be committed, will be removed in the next commit.
```diff
- ContainerInfo containerInfo = containerStack.push();
- containerInfo.type = valueType;
- containerInfo.endPosition = valueEndPosition;
+ containerStack.push(c -> c.initialize(valueType, valueEndPosition));
```
This is a change that would warrant performance testing of binary incremental reading to ensure it does not introduce a regression. Do we see extra overhead introduced by the creation and invocation of a lambda on each `stepIn`?
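For context, the pattern under discussion can be sketched as below. This is a minimal, hypothetical recycling stack, not ion-java's actual `_Private_RecyclingStack`: `push` reuses a previously allocated element when one is available and hands it to a caller-supplied initializer. Note that a lambda such as `c -> c.initialize(valueType, valueEndPosition)` captures local variables, so a capturing lambda instance may be allocated on each call unless the JIT can eliminate it via escape analysis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Minimal sketch (illustrative names, not the real ion-java class) of a
// recycling stack whose push() applies an initializer to a recycled element.
final class RecyclingStack<T> {
    interface ElementFactory<T> { T newElement(); }

    private final List<T> elements = new ArrayList<>();
    private final ElementFactory<T> factory;
    private int currentIndex = -1;

    RecyclingStack(ElementFactory<T> factory) { this.factory = factory; }

    // Reuses an element if one is available; otherwise allocates a new one.
    T push(Consumer<T> initializer) {
        currentIndex++;
        T element;
        if (currentIndex >= elements.size()) {
            element = factory.newElement();
            elements.add(element);
        } else {
            element = elements.get(currentIndex);
        }
        initializer.accept(element);
        return element;
    }

    T pop() { return elements.get(currentIndex--); }

    public static void main(String[] args) {
        RecyclingStack<int[]> stack = new RecyclingStack<>(() -> new int[2]);
        int[] first = stack.push(c -> { c[0] = 7; c[1] = 42; });
        stack.pop();
        // The same array instance is recycled on the next push.
        int[] second = stack.push(c -> { c[0] = 1; c[1] = 2; });
        System.out.println(first == second);        // true: element was recycled
        System.out.println(second[0] + "," + second[1]);
    }
}
```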
I will check whether the change introduces a regression in binary incremental reading.
Here are the results of benchmarking the incremental reader with the following test data; the results are neutral.
| Test Data | Before | After |
|---|---|---|
| log_59155.ion | 242.721 | 249.478 |
| log_194617.ion | 1858.668 | 1854.447 |
| test.json | 4273.663 | 4266.974 |
| catalog.ion | 0.476 | 0.478 |
| singleValue.10n | 0.063 | 0.063 |
```java
{
    // If we're adding a patch point we first need to ensure that all of our ancestors (containing values) already
    // have a patch point. No container can be smaller than its contents, so all outer layers also require patches.
    ListIterator<ContainerInfo> stackIterator = containers.iterator();
```
This allocates a new `$Iterator` every time. It looks like we could store and reuse a single instance, which may improve performance.
Here are the benchmark comparison results between commits ad7c0f9 and 05acf07. According to the results, there is a 0.78% performance improvement using test data `generatedContainerOnlyTestData.10n`, a 1.13% improvement using test data `log_59155.ion`, and a 0.69% improvement using test data `log_194617.ion`.
Here are the overall benchmark results after the change from commit 05acf07. (We re-run ion-java-benchmark-cli on every change to produce this version of the benchmark results.)
```java
stackIterator = new $Iterator();
return stackIterator;
```
I think this will be cleaner if in this method, you
- Allocate the iterator only if it has not yet been allocated, and
- Reset the iterator
That way you don't need a public resetIterator method, which you call every time you need to retrieve the iterator anyway. Also, you don't need to store a stackIterator variable in IonRawBinaryWriter, as retrieving it via containers.iterator() where you need it will be sufficient. In other words, iterator reuse can be achieved without any changes to IonRawBinaryWriter.
Thanks for the suggestions. I experimentally updated the `iterator()` method with the following change:

Conditional initialization:

```java
public ListIterator<T> iterator() {
    if (stackIterator != null) {
        stackIterator.cursor = _Private_RecyclingStack.this.currentIndex;
    } else {
        stackIterator = new $Iterator();
    }
    return stackIterator;
}
```
One concern I have with this change is that we need to run the if/else check every time we call `containers.iterator()`.
Unconditional initialization:

Then I tried initializing the `$Iterator` while constructing the `_Private_RecyclingStack`, so the `iterator()` method only resets the cursor:

```java
public final class _Private_RecyclingStack<T> implements Iterable<T> {
    // The iterator is allocated once, in the constructor.
    private $Iterator stackIterator;

    public _Private_RecyclingStack(int initialCapacity, ElementFactory<T> elementFactory) {
        elements = new ArrayList<>(initialCapacity);
        this.elementFactory = elementFactory;
        currentIndex = -1;
        top = null;
        stackIterator = new $Iterator();
    }

    public ListIterator<T> iterator() {
        stackIterator.cursor = _Private_RecyclingStack.this.currentIndex;
        return stackIterator;
    }
}
```
This change might cause unnecessary iterator allocation when we never need to iterate the stack. From the benchmark results, the first method (conditionally initializing the iterator) is more performant. Are there any other alternative implementations I wasn't aware of? Thanks!
| Test Data | conditional allocation | unconditional allocation |
|---|---|---|
| log_59155.ion | 496.576 | 501.664 |
| log_194617.ion | 3924.792 | 3965.73 |
The if/else is only a problem if it results in a noticeable performance degradation, but it doesn't look like it does. It's surprising that unconditional allocation is slower, but I think the conditional allocation is fine.
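To make the conditional-allocation approach concrete, here is a standalone sketch (hypothetical names, simplified from the real `_Private_RecyclingStack`) demonstrating that `iterator()` hands back one reused instance whose cursor is reset on each call:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Sketch of a stack whose iterator() lazily allocates a single reusable
// iterator and resets its cursor on each call (illustrative, not ion-java's
// actual implementation).
final class ReusableIteratorStack<T> implements Iterable<T> {
    private final List<T> elements = new ArrayList<>();
    private int currentIndex = -1;
    private StackIterator stackIterator;

    void push(T value) { elements.add(value); currentIndex++; }

    // Iterates from the top of the stack downward.
    private final class StackIterator implements Iterator<T> {
        int cursor;
        public boolean hasNext() { return cursor >= 0; }
        public T next() {
            if (!hasNext()) throw new NoSuchElementException();
            return elements.get(cursor--);
        }
    }

    @Override
    public Iterator<T> iterator() {
        if (stackIterator != null) {
            stackIterator.cursor = currentIndex;   // reset the reused instance
        } else {
            stackIterator = new StackIterator();
            stackIterator.cursor = currentIndex;
        }
        return stackIterator;
    }

    public static void main(String[] args) {
        ReusableIteratorStack<String> s = new ReusableIteratorStack<>();
        s.push("a");
        s.push("b");
        Iterator<String> first = s.iterator();
        StringBuilder sb = new StringBuilder();
        while (first.hasNext()) sb.append(first.next());
        Iterator<String> second = s.iterator();    // same instance, reset
        System.out.println(first == second);       // true: no new allocation
        System.out.println(sb);                    // "ba": top-down order
    }
}
```

The trade-off is that the returned iterator is not safe for nested or concurrent iteration, which is acceptable for a single-threaded writer's internal stack.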
Issue #, if available:
N/A
Note: This is the implementation on top of PR #521
Description of changes:
Previously, we allocated a placeholder `patchpoint` for every container. During the `popContainer` process, these placeholders were assigned the appropriate `patchpoint` values. However, for conditions that didn't require a `patchpoint`, we had to reclaim these placeholder patchpoints. To eliminate the need to reclaim unused placeholder patchpoints, we implemented the following changes:

Instead of initializing a placeholder `patchpoint` for each container, we now maintain an index of the `patchpoint` associated with each container. By default, the `patchpoint` index is set to `-1`, indicating that no `patchpoint` value has been assigned to that container.

During the `popContainer` process, child containers are popped first, while their parent containers remain on the stack. If the current container (the child container) meets a condition that requires a `patchpoint`, it implies that its ancestors also need patchpoints. At this point, we trace back through its ancestors, allocating a placeholder `patchpoint` and assigning a `patchIndex` for each ancestor container. This continues until we encounter an ancestor with an assigned `patchpoint`. We replace the placeholder values with the correct data as we pop the ancestors.

To verify that the new changes show the expected performance improvements, we also include benchmark results from a generated test data file. The generated test data contains a stream of `500000` nested container values, all of which require `patchpoint` allocation.

Results summary:
When we benchmarked the new implementation with the container-only test data, we found a `7.12%` performance regression compared to the previous implementation (patch list as a single contiguous array). However, when we used real-world data for benchmarking, we found a `5.54%` speed improvement over the previous implementation using test data `log_59155.ion`, and a `6.12%` speed improvement using test data `log_194617.ion`. Overall, after the new patchpoint implementation, we gained a `1.84%` improvement over the original implementation on dataset `log_59155.ion`, a `4.38%` improvement on dataset `log_194617.ion`, and `28.23%` on dataset `generatedContainerOnlyTestData.10n`.

Next step:
For the next step, we should investigate the cause of the performance regression observed when benchmarking with the container-only test data. By comparing the profiling results from the two implementations, we can gain more insight into how to improve the current implementation.
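The lazy patch-point scheme described above can be sketched roughly as follows. All class and field names here are illustrative, not ion-java's actual types; the point is the traceback: when a popping container turns out to need a patch point, every enclosing container gets a placeholder, stopping at the first ancestor that already has one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative sketch of on-demand patch point allocation: each container
// records only the index of its patch point, -1 meaning "none assigned yet".
final class PatchPointSketch {
    static final class ContainerInfo {
        int patchIndex = -1;
    }

    static final class PatchPoint {
        long oldPosition;
        int length;  // placeholder values are overwritten when the container pops
    }

    final Deque<ContainerInfo> containers = new ArrayDeque<>();
    final List<PatchPoint> patchPoints = new ArrayList<>();

    // Called when the popping container needs a patch point: no container can
    // be smaller than its contents, so walk outward allocating placeholders
    // until we reach an ancestor that already has a patch point.
    void ensureAncestorsHavePatchPoints() {
        for (ContainerInfo ancestor : containers) {   // innermost-first
            if (ancestor.patchIndex >= 0) {
                break;  // this ancestor (and all outer ones) already patched
            }
            ancestor.patchIndex = patchPoints.size();
            patchPoints.add(new PatchPoint());        // placeholder
        }
    }

    public static void main(String[] args) {
        PatchPointSketch w = new PatchPointSketch();
        // Three nested containers, none of which has a patch point yet.
        w.containers.push(new ContainerInfo());
        w.containers.push(new ContainerInfo());
        w.containers.push(new ContainerInfo());
        w.ensureAncestorsHavePatchPoints();
        System.out.println(w.patchPoints.size());  // 3: all containers patched

        // A new unpatched child triggers only one more allocation, because the
        // traceback stops at the first already-patched ancestor.
        w.containers.push(new ContainerInfo());
        w.ensureAncestorsHavePatchPoints();
        System.out.println(w.patchPoints.size());  // 4
    }
}
```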
Full benchmark results :
Benchmark a write of data equivalent to a stream of 500000 nested container values using IonWriter (binary). The output data is written into an in-memory buffer. (3 forks, 2 warmups, 2 iterations, preallocation 1)
Benchmark a write of data equivalent to a stream of 194617 nested binary Ion values using IonWriter (binary). The output data is written into an in-memory buffer. (3 forks, 2 warmups, 2 iterations, preallocation 1)
Benchmark a write of data equivalent to a stream of 59155 nested binary Ion values. The output data is written into an in-memory buffer. (3 forks, 2 warmups, 2 iterations, preallocation 1)
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.