[SPARK-15765][SQL][Streaming] Make continuous Parquet writes consistent with non-continuous Parquet writes #13507
What changes were proposed in this pull request?
Continuous Parquet writes (as used in Structured Streaming) and non-continuous writes currently duplicate some code; see ParquetFileFormat#prepareWrite() and ParquetFileFormat#ParquetOutputWriterFactory.
This duplication can lead to inconsistent behavior when a change is made to one copy but not the other.
This patch extracts the common code so that both paths share it, which removes the inconsistency. As a result, Structured Streaming now also picks up the change from SPARK-15719.
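To illustrate the idea only, here is a minimal sketch of the refactoring pattern, not the actual patch. All names (`SharedWriteSetup`, `commonSettings`, the `example.*` keys) are hypothetical, and a plain `Map` stands in for the real job configuration. The point is that both write paths call one shared helper, so a change to the shared settings reaches both.

```scala
// Hypothetical sketch: none of these names come from the Spark code base.
object SharedWriteSetup {
  // Settings that both write paths need, defined in exactly one place.
  // A plain Map is an assumption standing in for the real configuration object.
  def commonSettings(compression: String): Map[String, String] =
    Map(
      "example.compression" -> compression,
      "example.summary.enabled" -> "false"
    )

  // Non-continuous (batch) path: reuses the shared settings.
  def prepareBatchWrite(compression: String): Map[String, String] =
    commonSettings(compression) + ("example.mode" -> "batch")

  // Continuous (streaming) path: reuses the same shared settings.
  def prepareStreamingWrite(compression: String): Map[String, String] =
    commonSettings(compression) + ("example.mode" -> "streaming")
}
```

With this shape, editing `commonSettings` changes both paths at once, which is the consistency property the description above is after.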
How was this patch tested?
This is a pure code refactoring with no logic change, so it should be covered by existing suites.